Abstract

This paper addresses the problem of averaging numbers across a wireless network from an 
important, but largely neglected, viewpoint: bandwidth/energy efficiency. We show that existing 
distributed averaging schemes have several drawbacks and are inefficient, producing networked 
dynamical systems that evolve with wasteful communications. Motivated by this, we develop 
Controlled Hopwise Averaging (CHA), a distributed asynchronous algorithm that attempts to
"make the most" out of each iteration by fully exploiting the broadcast nature of wireless medium
and enabling control of when to initiate an iteration. We show that CHA admits a common
quadratic Lyapunov function for analysis, derive bounds on its exponential convergence rate,
and show that they outperform the convergence rate of Pairwise Averaging for some common
graphs. We also introduce a new way to apply Lyapunov stability theory, using the Lyapunov
function to perform greedy, decentralized, feedback iteration control. Finally, through extensive
simulation on random geometric graphs, we show that CHA is substantially more efficient than
several existing schemes, requiring far fewer transmissions to complete an averaging task. 



1 Introduction

Averaging numbers across a network is a need that arises in many applications of mobile ad 
hoc networks and wireless sensor networks. In order to collaboratively accomplish a task, nodes
often have to compute the network-wide average of their individual observations. For example,
by averaging their individual throughputs, an ad hoc network of computers can assess how well 
the network, as a whole, is performing, and by averaging their humidity measurements, a wireless 
network of sensing agents can cooperatively detect the occurrence of local, deviation-from-average 



*This work was supported by the National Science Foundation under grant CMMI-0900806. 



anomalies. Therefore, methods that enable such computation are of notable interest. Moreover, for 
performance reasons, it is desirable that the methods developed be robust, scalable, and efficient. 

In principle, computation of network-wide averages may be accomplished via flooding, whereby 
every node floods the network with its observation, as well as centralized computation, whereby a 
central node uses an overlay tree to collect all the node observations, calculate their average, and 
send it back to every node. These two methods, unfortunately, have serious limitations: flooding is 
extremely bandwidth and energy inefficient because it propagates redundant information across the 
network, ignoring the fact that the ultimate goal is to simply determine the average. Centralized 
computation, on the other hand, is vulnerable to node mobility, node membership changes, and 
single-point failures, making it necessary to frequently maintain the overlay tree and occasionally 
start over with a new central node, both of which are rather costly to implement. 

The limitations of flooding and centralized computation have motivated the search for distributed 
averaging algorithms that require neither flooding of node observations, nor construction of overlay 
trees and routing tables, to execute. To date, numerous such algorithms have been developed in 
continuous-time [1–4] as well as in discrete-time for both synchronous [2, 3, 5–11] and asynchronous
[10, 12–19] models. The closely related topic of distributed consensus, where nodes seek to achieve
an arbitrary network-wide consensus on their individual opinions, has also been extensively studied;
see [20, 21] for early treatments, [1–10, 22–29] for more recent work, and [30] for a survey.

Although the current literature offers a rich collection of distributed averaging schemes along 
with in-depth analysis of their behaviors, their efficacy from a bandwidth/energy efficiency stand- 
point has not been examined. This paper is devoted to studying the distributed averaging problem 
from this standpoint. Its contributions are as follows: we first show that the existing schemes — 
regardless of whether they are developed in continuous- or discrete-time, for synchronous or asyn- 
chronous models — have a few deficiencies and are inefficient, producing networked dynamical sys- 
tems that evolve with wasteful communications. To address these issues, we develop Random 
Hopwise Averaging (RHA), an asynchronous distributed averaging algorithm with several positive 
features, including a novel one among the asynchronous schemes: an ability to fully exploit the 
broadcast nature of wireless medium, so that no overheard information is ever wastefully discarded. 
We show that RHA admits a common quadratic Lyapunov function, is almost surely asymptotically 
convergent, and eliminates all but one of the deficiencies facing the existing schemes. 

To tackle the remaining deficiency, on lack of control, we introduce the concept of feedback 
iteration control, whereby individual nodes use feedback to control when to initiate an iteration. 
Although simple and intuitive, this concept, somewhat surprisingly, has not been explored in the 
literature on distributed averaging [1–19] and distributed consensus [1–10, 20–30]. We show that
RHA, along with the common quadratic Lyapunov function, exhibits features that enable a greedy, 
decentralized approach to feedback iteration control, which leads to bandwidth/energy-efficient 
iterations at zero feedback cost. Based on this approach, we present two modified versions of RHA: 
an ideal version referred to as Ideal Controlled Hopwise Averaging (ICHA), and a practical one 



referred to simply as Controlled Hopwise Averaging (CHA). We show that ICHA yields a networked 
dynamical system with state-dependent switching, derive deterministic bounds on its exponential 
convergence rate for general and specific graphs, and show that the bounds are better than the 
stochastic convergence rate of Pairwise Averaging [10, 12] for path, cycle, and complete graphs. We
also show that CHA is able to closely mimic the behavior of ICHA, achieving the same bounds on
its convergence rate. Finally, via extensive simulation on random geometric graphs, we demonstrate
that CHA is substantially more bandwidth/energy efficient than Pairwise Averaging [12], Consensus
Propagation [18], Algorithm A2 of [19], and Distributed Random Grouping [17], requiring far fewer
transmissions to complete an averaging task. In particular, CHA is twice as efficient as the
most efficient existing scheme when the network is sparsely connected.

The outline of this paper is as follows: Section 2 formulates the distributed averaging problem.
Section 3 describes the deficiencies of the existing schemes. Sections 4 and 5 develop RHA and
CHA and characterize their convergence properties. In Section 6, their comparison with several
existing schemes is carried out. Finally, Section 7 concludes the paper. The proofs of the main
results are included in the Appendix.

2 Problem Formulation 

Consider a multi-hop wireless network consisting of $N \ge 2$ nodes, connected by $L$ bidirectional
links in a fixed topology. The network is modeled as a connected, undirected graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$,
where $\mathcal{V} = \{1, 2, \ldots, N\}$ represents the set of $N$ nodes (vertices) and $\mathcal{E} \subset \{\{i,j\} : i,j \in \mathcal{V}, i \ne j\}$
represents the set of $L$ links (edges). Any two nodes $i,j \in \mathcal{V}$ are one-hop neighbors and can
communicate if and only if $\{i,j\} \in \mathcal{E}$. The set of one-hop neighbors of each node $i \in \mathcal{V}$ is denoted
as $\mathcal{N}_i = \{j \in \mathcal{V} : \{i,j\} \in \mathcal{E}\}$, and the communications are assumed to be delay- and error-free, with
no quantization. Each node $i \in \mathcal{V}$ observes a scalar $y_i \in \mathbb{R}$, and all the $N$ nodes wish to determine
the network-wide average $x^* \in \mathbb{R}$ of their individual observations, given by

$$x^* = \frac{1}{N} \sum_{i \in \mathcal{V}} y_i. \tag{1}$$

Given the above model, the problem addressed in this paper is how to construct a distributed
averaging algorithm (continuous- or discrete-time, synchronous or otherwise) with which each
node $i \in \mathcal{V}$ repeatedly communicates with its one-hop neighbors, iteratively updates its estimate
$x_i \in \mathbb{R}$ of the unknown average $x^*$ in (1), and asymptotically drives $x_i$ to $x^*$, all while consuming
bandwidth and energy efficiently.

The bandwidth/energy efficiency of an algorithm is measured by the number of real-number
transmissions it needs to drive all the $x_i$'s to a sufficiently small neighborhood of $x^*$, essentially
completing the averaging task. This quantity is a natural measure of efficiency because the smaller
it is, the less bandwidth is occupied, the less energy is expended for communications, and the
faster an averaging task may be completed. These, in turn, imply more bandwidth and time for
other tasks, smaller probability of collision, longer lifetime for battery-powered nodes, and possible
earlier return to sleep mode, all of which are desirable. The quantity also allows algorithms with
different numbers of real-number transmissions per iteration to be fairly compared. Although, in
networking, every message inevitably contains overhead (e.g., transmitter/receiver IDs and message
type), we exclude such overhead when measuring efficiency since it is not inherent to an algorithm,
may be reduced by piggybacking messages, and becomes negligible when averaging long vectors.

3 Deficiencies of Existing Schemes 

As was pointed out in Section 1, the current literature offers a variety of distributed averaging
schemes for solving the problem formulated in Section 2. Unfortunately, as is explained below, they
suffer from a number of deficiencies, especially a lack of bandwidth/energy efficiency, by producing
networked dynamical systems that evolve with wasteful real-number transmissions.

The continuous-time algorithms in [1–4] have the following deficiency:

D1. Costly discretization: As immensely inefficient as flooding is, the continuous-time algorithms
in [1–4] may be more so: flooding only requires $N^2$ real-number transmissions for all the $N$
nodes to exactly determine the average $x^*$ (since it takes $N$ real-number transmissions for
each node $i \in \mathcal{V}$ to flood the network with its $y_i$), whereas these algorithms may need far
more than that to essentially complete an averaging task. For instance, the algorithm in [1]
updates the estimates $x_i$'s of $x^*$ according to the differential equation

$$\dot{x}_i(t) = \sum_{j \in \mathcal{N}_i} \bigl(x_j(t) - x_i(t)\bigr), \quad \forall i \in \mathcal{V}. \tag{2}$$

To realize (2), each node $i \in \mathcal{V}$ has to continuously monitor the $x_j(t)$ of every one-hop neighbor
$j \in \mathcal{N}_i$. If this can be done without wireless communications (e.g., by direct sensing), then
the bandwidth/energy efficiency issue is moot. If wireless communications must be employed,
then (2) has to be discretized, either exactly via a zero-order hold, i.e.,

$$x_i((k+1)T) = \sum_{j \in \mathcal{V}} h_{ij}\, x_j(kT), \quad \forall i \in \mathcal{V}, \tag{3}$$

or approximately via numerical techniques such as the Euler forward difference method, i.e.,

$$\frac{x_i((k+1)T) - x_i(kT)}{T} = \sum_{j \in \mathcal{N}_i} \bigl(x_j(kT) - x_i(kT)\bigr), \quad \forall i \in \mathcal{V}, \tag{4}$$

where each $h_{ij} \in \mathbb{R}$ is the $ij$-entry of $e^{-\mathcal{L}T}$, $\mathcal{L} \in \mathbb{R}^{N \times N}$ is the Laplacian matrix of the graph
$\mathcal{G}$ that governs the dynamics (2), and $T > 0$ is the sampling period. Regardless of (3) or (4),
they may be far more costly to realize than flooding: with (3), $N^2$ real-number transmissions
are already needed per iteration (since, in general, $h_{ij} \ne 0$ $\forall i,j \in \mathcal{V}$, so that each node $i \in \mathcal{V}$
has to flood the network with its $x_i(kT)$, for every $k$). In contrast, with (4), only $N$ real-number
transmissions are needed per iteration (since each node $i \in \mathcal{V}$ only has to wirelessly
transmit its $x_i(kT)$ once, to every one-hop neighbor $j \in \mathcal{N}_i$, for every $k$). However, the
number of iterations needed for all the $x_i(kT)$'s to converge to an acceptable neighborhood
of $x^*$ may be very large, since $T$ must be sufficiently small for (4) to be stable. If this number
exceeds $N$, which is possible and likely so with a conservatively small $T$, then (4) would be
worse than flooding (flooding is, of course, more storage and bookkeeping intensive).
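
The cost comparison in D1 can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it applies the Euler update (4) on a 10-node path graph and counts real-number transmissions ($N$ per iteration) until every estimate is within a tolerance of $x^*$. The graph, the initial values, the sampling period `T = 0.25`, the tolerance `tol = 0.01`, and the function name `euler_transmissions` are all illustrative assumptions.

```python
def euler_transmissions(neighbors, y, T, tol, max_iters=100_000):
    """Count real-number transmissions used by the Euler update (4)
    until every estimate is within tol of the true average."""
    n = len(y)
    avg = sum(y) / n
    x = list(y)
    iters = 0
    while max(abs(v - avg) for v in x) > tol and iters < max_iters:
        # Every node uses the values its one-hop neighbors just transmitted.
        x = [x[i] + T * sum(x[j] - x[i] for j in neighbors[i]) for i in range(n)]
        iters += 1
    return iters * n  # each node transmits once per iteration


N = 10  # assumed network size
path = [[j for j in (i - 1, i + 1) if 0 <= j < N] for i in range(N)]  # path graph
y = [float(i) for i in range(N)]  # assumed observations
euler_cost = euler_transmissions(path, y, T=0.25, tol=0.01)
flooding_cost = N * N  # flooding needs N^2 real-number transmissions
print(euler_cost, flooding_cost)
```

In this particular setup the Euler scheme needs more than $N$ iterations, so its transmission count exceeds that of flooding, which is the situation D1 warns about. Other graphs or tolerances may behave differently.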

The discrete-time synchronous algorithms in [2, 3, 5–11] have the following deficiencies:

D2. Clock synchronization: The discrete-time synchronous algorithms in [2, 3, 5–11] require all the
$N$ nodes to always have the same clock to operate. Although techniques for reducing clock
synchronization errors are available, it is still desirable that this requirement be removed.

D3. Forced transmissions: The algorithms in [2, 3, 5–10] update the estimates $x_i$'s of $x^*$ according
to the difference equation

$$x_i(k+1) = w_{ii}(k)\, x_i(k) + \sum_{j \in \mathcal{N}_i} w_{ij}(k)\, x_j(k), \quad \forall i \in \mathcal{V}, \tag{5}$$

where each $w_{ij}(k) \in \mathbb{R}$ is a weighting factor that is typically constant. The $w_{ij}(k)$'s may be
specified in several ways, including choosing them to maximize the convergence rate [5] or
minimize the mean-square deviation [9]. However, no matter how the $w_{ij}(k)$'s are chosen,
these algorithms are bandwidth/energy inefficient because the underlying update rule (5)
simply forces every node $i \in \mathcal{V}$ at each iteration $k$ to transmit its $x_i(k)$ to its one-hop
neighbors, irrespective of whether such transmissions are worthy. It is possible, for example,
that the $x_i(k)$'s of a cluster of nearby nodes are almost equal, so that their $x_i(k+1)$'s, being
convex combinations of their $x_i(k)$'s, are also almost equal, causing their transmissions to be
unworthy. The fact that $N$ real-number transmissions are needed per iteration also implies
that (5) must drive all the $x_i(k)$'s to an acceptable neighborhood of $x^*$ within at most $N$
iterations, in order to just outperform flooding.

D4. Computing intermediate quantities: The scheme in [8] uses two parallel runs of a consensus
algorithm to obtain two consensus values and defines each $x_i(k)$ as the ratio of these two
values. While possible, this scheme is likely inefficient because it attempts to compute two
intermediate quantities, as opposed to computing $x^*$ directly.

The discrete-time asynchronous algorithms in [10, 12–19] have the following deficiencies:

D5. Wasted receptions: Each iteration of Pairwise Averaging [12], Anti-Entropy Aggregation [13,
14], Randomized Gossip Algorithm [15], and Accelerated Gossip Algorithm [16] involves a
pair of nodes transmitting to each other their state variables. Due to the broadcast nature
of wireless medium, their transmissions are overheard by unintended nearby nodes, who
would immediately discard this "free" information, instead of using it to possibly speed up
convergence, enhancing bandwidth/energy efficiency. Hence, these algorithms result in wasted
receptions. The same can be said about Consensus Propagation [18] and Algorithm A2 of [19],
although they do not assume pairwise exchanges. It can also be said about Distributed
Random Grouping [17], which only slightly exploits such broadcast nature: the leader of a
group does, but the members, who contribute the majority of the transmissions, do not.

D6. Overlapping iterations: Pairwise Averaging [12], Anti-Entropy Aggregation [13, 14], Randomized
Gossip Algorithm [15], Accelerated Gossip Algorithm [16], and Distributed Random
Grouping [17] require sequential transmissions from multiple nodes to execute an iteration.
This suggests that before an iteration completes, the nodes involved may be asked to participate
in other iterations initiated by those unaware of the ongoing iteration. Thus, these
algorithms are prone to overlapping iterations and, therefore, to deadlock situations [19]. It
is noted that this practical issue is naturally avoided by Consensus Propagation [18] and
explicitly handled by Algorithms A1 and A2 of [19].

D7. Uncontrolled iterations: The discrete-time asynchronous algorithms in [12–19] do not let
individual nodes use information available to them during runtime (e.g., history of the state
variables they locally maintain) to control when to initiate an iteration and whom to include in
the iteration. Indeed, Pairwise Averaging [12], Anti-Entropy Aggregation [13, 14], Accelerated
Gossip Algorithm [16], Consensus Propagation [18], and Algorithm A2 of [19] focus mostly on
how nodes would update their state variables during an iteration, saying little about how they
could use such information to control the iterations. Randomized Gossip Algorithm [15] and
Distributed Random Grouping [17], on the other hand, let nodes randomly initiate an iteration
according to some probabilities. Although these probabilities may be optimized [15, 17], the
optimization is carried out a priori, dependent only on the graph $\mathcal{G}$ and independent of the
nodes' state variables during runtime. Consequently, wasteful iterations may occur, despite
the optimality. For instance, suppose Randomized Gossip Algorithm [15] is utilized, and a
pair of adjacent nodes $i,j \in \mathcal{V}$ have just finished gossiping with each other, so that $x_i$ and
$x_j$ are equal. Since the optimal probabilities are generally nonzero, nodes $i$ and $j$ may gossip
with each other again before any of them gossips with someone else, causing $x_i$ and $x_j$ to
remain unchanged, wasting that particular gossip. Similarly, suppose Distributed Random
Grouping [17] is employed, and a node $i \in \mathcal{V}$ has just finished leading an iteration, so that
$x_i$ and $x_j$ $\forall j \in \mathcal{N}_i$ are equal. Due again to nonzero probabilities, node $i$ may lead another
iteration before any of its one- or two-hop neighbors leads an iteration, causing $x_i$ and $x_j$
$\forall j \in \mathcal{N}_i$ to stay the same, wasting that particular iteration. These examples suggest that
not letting nodes control the iterations is detrimental to bandwidth/energy efficiency and,
conceivably, letting them do so may cut down on wasteful iterations, improving efficiency.

D8. Steady-state errors: Consensus Propagation [18] ensures that all the $x_i$'s asymptotically
converge to the same steady-state value. However, this value is, in general, not equal to $x^*$ (see
Figure 3 of Section 6 for an illustration). Although the error can be made arbitrarily small,
it comes at the expense of increasingly slow convergence [18], which is undesirable.

D9. Lack of convergence guarantees: Accelerated Gossip Algorithm [16], developed based on the
power method in numerical analysis, is shown by simulation to have the potential of speeding
up the convergence of Randomized Gossip Algorithm [15] by a factor of 10. Furthermore,
whenever all the $x_i$'s converge, they must converge to $x^*$. However, it was not established
in [16] that they would always converge.
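
The wasted gossip described in D7 is easy to reproduce. The sketch below is an illustrative toy, not code from any of the cited schemes: the function `gossip` and the three-node state are assumptions. It performs a pairwise averaging step twice on the same pair and reports whether each step changed anything.

```python
def gossip(x, i, j):
    """Pairwise averaging: nodes i and j replace their values with the
    mean of the pair. Returns True if either value actually changed."""
    mean = (x[i] + x[j]) / 2
    changed = (x[i] != mean) or (x[j] != mean)
    x[i] = x[j] = mean
    return changed


x = [1.0, 3.0, 8.0]           # assumed state variables of three nodes
first = gossip(x, 0, 1)       # nodes 0 and 1 gossip: both move to 2.0
second = gossip(x, 0, 1)      # same pair gossips again before anyone else
transmissions = 2 * 2         # each gossip costs two real-number transmissions
useful = 2 * (first + second) # transmissions that changed the state
print(first, second, x, transmissions, useful)
```

The second exchange leaves the state untouched, yet it still costs two transmissions. A node that remembers it has just gossiped with the same neighbor could skip such an exchange, which is the kind of runtime information D7 refers to.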



4 Random Hopwise Averaging 

Deficiencies D1–D9 facing the existing distributed averaging schemes raise a question: is it
possible to develop an algorithm that does not suffer from any of these deficiencies? In this
section, we construct an algorithm that simultaneously eliminates all but issue D7 with uncontrolled
iterations. In the next section, we will modify the algorithm to address this issue.

To circumvent the costly discretization issue D1 facing the existing continuous-time algorithms
and the clock synchronization and forced transmissions issues D2 and D3 facing the existing
discrete-time synchronous algorithms, the algorithm we construct must be asynchronous, regardless of
whether the nodes have access to the same global clock. To avoid issue D6 with overlapping
iterations, each iteration of this algorithm must involve only a single node sending a single message
to its one-hop neighbors, without needing them to reply. To tackle issue D5 with wasted receptions,
all the neighbors, upon hearing the same message, have to "meaningfully" incorporate it into
updating their state variables, rather than simply discarding it. To overcome issues D8 and D9 with
steady-state errors and convergence guarantees, the algorithm must be asymptotically convergent
to the correct average. Finally, to eliminate D4, it has to avoid computing intermediate quantities.

To develop an algorithm having the aforementioned properties, consider a networked dynamical
system, defined on the graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ as follows: associated with each link $\{i,j\} \in \mathcal{E}$ are a
parameter $c_{\{i,j\}} > 0$ and a state variable $x_{\{i,j\}} \in \mathbb{R}$ of the system. In addition, associated with
each node $i \in \mathcal{V}$ is an output variable $x_i \in \mathbb{R}$, which represents its estimate of the unknown average
$x^*$ in (1). Since the graph $\mathcal{G}$ has $L$ links and $N$ nodes, the system has $L$ parameters $c_{\{i,j\}}$'s, $L$
state variables $x_{\{i,j\}}$'s, and $N$ output variables $x_i$'s. To describe the system dynamics, let $x_{\{i,j\}}(0)$
and $x_i(0)$ represent the initial values of $x_{\{i,j\}}$ and $x_i$, and $x_{\{i,j\}}(k)$ and $x_i(k)$ their values upon
completing each iteration $k \in \mathbb{P}$, where $\mathbb{P}$ denotes the set of positive integers. With these notations,
the state and output equations governing the system dynamics may be stated as

$$x_{\{i,j\}}(k) = \begin{cases} \dfrac{\sum_{\ell \in \mathcal{N}_{u(k)}} c_{\{u(k),\ell\}}\, x_{\{u(k),\ell\}}(k-1)}{\sum_{\ell \in \mathcal{N}_{u(k)}} c_{\{u(k),\ell\}}}, & \text{if } u(k) \in \{i,j\}, \\[2ex] x_{\{i,j\}}(k-1), & \text{otherwise}, \end{cases} \quad \forall k \in \mathbb{P},\ \forall \{i,j\} \in \mathcal{E}, \tag{6}$$

$$x_i(k) = \frac{\sum_{j \in \mathcal{N}_i} c_{\{i,j\}}\, x_{\{i,j\}}(k)}{\sum_{j \in \mathcal{N}_i} c_{\{i,j\}}}, \quad \forall k \in \mathbb{N},\ \forall i \in \mathcal{V}, \tag{7}$$

where $u(k) \in \mathcal{V}$ is a variable to be interpreted shortly and $\mathbb{N}$ denotes the set of nonnegative integers.
Equation (7) says that the output variable associated with each node is a convex combination of the
state variables associated with links incident to the node. Equation (6) says that at each iteration
$k \in \mathbb{P}$, the state variables associated with links incident to node $u(k)$ are set equal to the same
convex combination of their previous values. Equation (6) also implies that the system is a linear
switched system, since (6) may be written as

$$\mathbf{x}(k) = \mathbf{A}_{u(k)}\, \mathbf{x}(k-1), \quad \forall k \in \mathbb{P}, \tag{8}$$

where $\mathbf{x}(k) \in \mathbb{R}^L$ is the state vector obtained by stacking the $L$ $x_{\{i,j\}}(k)$'s, $\mathbf{A}_{u(k)} \in \mathbb{R}^{L \times L}$ is a
time-varying matrix taking one of $N$ possible values $\mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_N$ depending on $u(k)$, and each
$\mathbf{A}_i \in \mathbb{R}^{L \times L}$ is a row stochastic matrix whose entries depend on the $c_{\{i,j\}}$'s. Hence, the sequence
$(u(k))_{k=1}^{\infty}$ fully dictates how the asynchronous iteration (6) takes place, or equivalently, how the
system (8) switches. Throughout this section, we assume that $(u(k))_{k=1}^{\infty}$ is an independent and
identically distributed random sequence with a uniform distribution, i.e.,

$$\mathrm{P}\{u(k) = i\} = \frac{1}{N}, \quad \forall k \in \mathbb{P},\ \forall i \in \mathcal{V}. \tag{9}$$

Remark 1. Clearly, alternatives to letting $(u(k))_{k=1}^{\infty}$ be random and equiprobable are possible, and
perhaps beneficial. We will explore such alternatives in Section 5 when we discuss control. ■

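For the system (6), (7), (9) to solve the distributed averaging problem, the $x_i(k)$'s must
asymptotically approach $x^*$ of (1), i.e.,

$$\lim_{k \to \infty} x_i(k) = x^*, \quad \forall i \in \mathcal{V}. \tag{10}$$

Due to (7), condition (10) is met if the $x_{\{i,j\}}(k)$'s satisfy

$$\lim_{k \to \infty} x_{\{i,j\}}(k) = x^*, \quad \forall \{i,j\} \in \mathcal{E}. \tag{11}$$

To ensure (11), the parameters $c_{\{i,j\}}$'s and initial states $x_{\{i,j\}}(0)$'s must satisfy a condition. To derive
the condition, observe from (6) that no matter what $u(k)$ is, the expression $\sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}}\, x_{\{i,j\}}(k)$
is conserved after every iteration $k \in \mathbb{P}$, i.e.,

$$\sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}}\, x_{\{i,j\}}(k) = \sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}}\, x_{\{i,j\}}(k-1), \quad \forall k \in \mathbb{P}. \tag{12}$$

Therefore, as it follows from (12) and (1), (11) holds only if the $c_{\{i,j\}}$'s and $x_{\{i,j\}}(0)$'s satisfy

$$\frac{\sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}}\, x_{\{i,j\}}(0)}{\sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}}} = \frac{\sum_{i \in \mathcal{V}} y_i}{N}. \tag{13}$$

To achieve (13), notice that the expressions $\sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}}$ and $\sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}}\, x_{\{i,j\}}(0)$ each has $L$
terms, of which $|\mathcal{N}_i|$ terms are associated with links incident to node $i$, for every $i \in \mathcal{V}$, where $|\cdot|$
denotes the cardinality of a set. Hence, by letting each node $i \in \mathcal{V}$ evenly distribute the number 1
to the $|\mathcal{N}_i|$ terms in $\sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}}$, i.e.,

$$c_{\{i,j\}} = \frac{1}{|\mathcal{N}_i|} + \frac{1}{|\mathcal{N}_j|}, \quad \forall \{i,j\} \in \mathcal{E}, \tag{14}$$

we get $\sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}} = N$. Similarly, by letting each node $i \in \mathcal{V}$ evenly distribute its observation
$y_i$ to the $|\mathcal{N}_i|$ terms in $\sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}}\, x_{\{i,j\}}(0)$, i.e.,

$$x_{\{i,j\}}(0) = \frac{\dfrac{y_i}{|\mathcal{N}_i|} + \dfrac{y_j}{|\mathcal{N}_j|}}{c_{\{i,j\}}}, \quad \forall \{i,j\} \in \mathcal{E}, \tag{15}$$

we get $\sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}}\, x_{\{i,j\}}(0) = \sum_{i \in \mathcal{V}} y_i$. Thus, (14) and (15) together ensure (13), which is
necessary for achieving (11).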

Remark 2. Obviously, (14) and (15) are not the only way to select the $c_{\{i,j\}}$'s and $x_{\{i,j\}}(0)$'s. In fact,
their selection may be posed as an optimization problem, analogous to the synchronous algorithms
in [5, 9]. Nevertheless, (14) and (15) have the virtue of being simple and inexpensive to implement:
for every link $\{i,j\} \in \mathcal{E}$, both $c_{\{i,j\}}$ and $x_{\{i,j\}}(0)$ depend only on local information $|\mathcal{N}_i|$, $|\mathcal{N}_j|$, $y_i$,
and $y_j$ that nodes $i$ and $j$ know, as opposed to on global information derived from the graph $\mathcal{G}$,
which is typically difficult and costly to gather, but often the outcome of optimization. ■
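
The choice of link parameters and initial link states can be checked on a small example. The sketch below is illustrative only: the 4-node graph, the observations, and the helper name `init_links` are assumptions. It uses exact rational arithmetic to confirm that the link parameters sum to $N$ and that the weighted sum of initial link states equals the sum of the observations, so the weighted mean of the link states equals the network-wide average.

```python
from fractions import Fraction


def init_links(neighbors, y):
    """Compute c_{i,j} = 1/|N_i| + 1/|N_j| and the initial link state
    x_{i,j}(0) = (y_i/|N_i| + y_j/|N_j|) / c_{i,j} for every link."""
    c, x0 = {}, {}
    for i in neighbors:
        for j in neighbors[i]:
            if i < j:  # visit each undirected link once
                di, dj = len(neighbors[i]), len(neighbors[j])
                c[(i, j)] = Fraction(1, di) + Fraction(1, dj)
                x0[(i, j)] = (Fraction(y[i]) / di + Fraction(y[j]) / dj) / c[(i, j)]
    return c, x0


# Assumed example: a triangle 0-1-2 with a pendant node 3 attached to node 0.
neighbors = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
y = {0: 4, 1: -1, 2: 7, 3: 10}

c, x0 = init_links(neighbors, y)
sum_c = sum(c.values())
sum_cx = sum(c[e] * x0[e] for e in c)
print(sum_c, sum_cx, sum_cx / sum_c)
```

Both quantities are computed from purely local information (the degrees and observations of the two endpoints of each link), which is the point made in Remark 2.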

The system (6), (7), (9) with parameters (14) and initial states (15) can be realized over the
wireless network by having the nodes take the following actions: for every link $\{i,j\} \in \mathcal{E}$, nodes
$i$ and $j$ each maintains a local copy of $x_{\{i,j\}}(k)$, denoted as $x_{ij}(k)$ and $x_{ji}(k)$, respectively, where
they are meant to be always equal, so that the order of the subscripts is only used to indicate where
they physically reside. Each node $i \in \mathcal{V}$, in addition to $x_{ij}(k)$ $\forall j \in \mathcal{N}_i$, also maintains $c_{\{i,j\}}$ $\forall j \in \mathcal{N}_i$
and $x_i(k)$. To initialize the system, every node $i \in \mathcal{V}$ transmits $|\mathcal{N}_i|$ and $y_i$, each once, to every
one-hop neighbor $j \in \mathcal{N}_i$, so that upon completion, each node $i \in \mathcal{V}$ can calculate $c_{\{i,j\}}$ $\forall j \in \mathcal{N}_i$
from (14), $x_{ij}(0)$ $\forall j \in \mathcal{N}_i$ from (15), and $x_i(0)$ from (7). To evolve the system, at each iteration
$k \in \mathbb{P}$, a node $u(k) \in \mathcal{V}$ is selected randomly and equiprobably based on (9) to initiate the iteration.
To describe the subsequent actions, note that (6) and (7) imply:

(i) $x_{u(k)}(k) = x_{u(k)}(k-1)$;
(ii) $x_{u(k)j}(k) = x_{u(k)}(k)$ $\forall j \in \mathcal{N}_{u(k)}$;
(iii) $x_{ju(k)}(k) = x_{u(k)}(k)$ $\forall j \in \mathcal{N}_{u(k)}$;
(iv) $x_{j\ell}(k) = x_{j\ell}(k-1)$ $\forall \ell \in \mathcal{N}_j - \{u(k)\}$ $\forall j \in \mathcal{N}_{u(k)}$;
(v) $x_j(k) = \dfrac{\sum_{\ell \in \mathcal{N}_j} c_{\{j,\ell\}}\, x_{j\ell}(k)}{\sum_{\ell \in \mathcal{N}_j} c_{\{j,\ell\}}}$ $\forall j \in \mathcal{N}_{u(k)}$;
(vi) $x_{\ell m}(k) = x_{\ell m}(k-1)$ $\forall m \in \mathcal{N}_\ell$ $\forall \ell \in \mathcal{V} - (\{u(k)\} \cup \mathcal{N}_{u(k)})$; and
(vii) $x_\ell(k) = x_\ell(k-1)$ $\forall \ell \in \mathcal{V} - (\{u(k)\} \cup \mathcal{N}_{u(k)})$.

To execute (i) and (ii), node $u(k)$, upon being selected to initiate iteration $k$, sets $x_{u(k)}(k)$ and $x_{u(k)j}(k)$
$\forall j \in \mathcal{N}_{u(k)}$ all to $x_{u(k)}(k-1)$. To execute (iii), node $u(k)$ then transmits $x_{u(k)}(k)$ once, to every
one-hop neighbor $j \in \mathcal{N}_{u(k)}$, so that upon reception, each of them can set $x_{ju(k)}(k)$ to $x_{u(k)}(k)$.
Equations (iv) and (v) say that every neighbor $j \in \mathcal{N}_{u(k)}$ experiences no change in the rest of its
local copies and, hence, can compute $x_j(k)$ from (v) upon finishing (iii). Finally, (vi) and (vii) say
that the rest of the $N$ nodes, i.e., excluding node $u(k)$ and its one-hop neighbors, experience no
change in the variables they maintain.

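The above node actions define a distributed averaging algorithm that runs iteratively and
asynchronously on the wireless network. We refer to this algorithm as Random Hopwise Averaging
(RHA), since every iteration is randomly initiated and involves state variables associated with links
within one hop of each other. RHA may be expressed in a compact algorithmic form as follows:

Algorithm 1 (Random Hopwise Averaging).
Initialization:

1. Each node $i \in \mathcal{V}$ transmits $|\mathcal{N}_i|$ and $y_i$ to every node $j \in \mathcal{N}_i$.

2. Each node $i \in \mathcal{V}$ creates variables $x_{ij} \in \mathbb{R}$ $\forall j \in \mathcal{N}_i$ and $x_i \in \mathbb{R}$ and initializes them sequentially, with $c_{\{i,j\}}$ given by (14):

$$x_{ij} \leftarrow \frac{\dfrac{y_i}{|\mathcal{N}_i|} + \dfrac{y_j}{|\mathcal{N}_j|}}{c_{\{i,j\}}}, \quad \forall j \in \mathcal{N}_i,$$

$$x_i \leftarrow \frac{\sum_{j \in \mathcal{N}_i} c_{\{i,j\}}\, x_{ij}}{\sum_{j \in \mathcal{N}_i} c_{\{i,j\}}}.$$

Operation: At each iteration:

3. A node, say, node $i$, is selected randomly and equiprobably out of the set $\mathcal{V}$ of $N$ nodes.

4. Node $i$ updates $x_{ij}$ $\forall j \in \mathcal{N}_i$:
$$x_{ij} \leftarrow x_i, \quad \forall j \in \mathcal{N}_i.$$

5. Node $i$ transmits $x_i$ to every node $j \in \mathcal{N}_i$.

6. Each node $j \in \mathcal{N}_i$ updates $x_{ji}$ and $x_j$ sequentially:

$$x_{ji} \leftarrow x_i,$$

$$x_j \leftarrow \frac{\sum_{\ell \in \mathcal{N}_j} c_{\{j,\ell\}}\, x_{j\ell}}{\sum_{\ell \in \mathcal{N}_j} c_{\{j,\ell\}}}.$$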
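
The steps of Algorithm 1 can be simulated centrally in a few lines. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `rha`, the dictionary-based data layout, the example graph, the seed, and the iteration count are all illustrative, and a single process plays the role of every node. Each node's local copies $x_{ij}$ are stored under the key `(i, j)`.

```python
import random


def rha(neighbors, y, iterations, seed=0):
    """Simulate Random Hopwise Averaging and return the estimates x_i."""
    rng = random.Random(seed)
    nodes = sorted(neighbors)
    deg = {i: len(neighbors[i]) for i in nodes}
    # Steps 1-2: after exchanging |N_i| and y_i, node i computes c_{i,j},
    # its local copies x_ij, and its estimate x_i.
    c = {(i, j): 1 / deg[i] + 1 / deg[j] for i in nodes for j in neighbors[i]}
    xl = {(i, j): (y[i] / deg[i] + y[j] / deg[j]) / c[i, j]
          for i in nodes for j in neighbors[i]}

    def output(i):
        # Convex combination of node i's local link copies.
        den = sum(c[i, j] for j in neighbors[i])
        return sum(c[i, j] * xl[i, j] for j in neighbors[i]) / den

    x = {i: output(i) for i in nodes}
    for _ in range(iterations):
        i = rng.choice(nodes)      # Step 3: equiprobable initiator
        for j in neighbors[i]:
            xl[i, j] = x[i]        # Step 4: initiator overwrites its copies
        for j in neighbors[i]:     # Step 5: one broadcast of x_i is heard by all j
            xl[j, i] = x[i]        # Step 6: neighbor updates its copy ...
            x[j] = output(j)       # ... and then its estimate
    return x


# Assumed example graph: a 5-node cycle with one chord between nodes 0 and 2.
neighbors = {0: [1, 4, 2], 1: [0, 2], 2: [1, 3, 0], 3: [2, 4], 4: [3, 0]}
y = {0: 10.0, 1: -4.0, 2: 7.5, 3: 0.0, 4: 21.5}
estimates = rha(neighbors, y, iterations=2000)
average = sum(y.values()) / len(y)
print(average, estimates)
```

Each pass through the loop corresponds to exactly one real-number transmission (Step 5), so `iterations` is also the number of transmissions spent after initialization.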

Observe from Algorithm 1 that RHA requires an initialization overhead of $2N$ real-number
transmissions to perform Step 1 (the $|\mathcal{N}_i|$'s are counted as real numbers, for simplicity). However,
each iteration of RHA requires only transmission of a single message, consisting of exactly one
real number, by the initiating node, in Step 5. Also notice that RHA fully exploits the broadcast
nature of wireless medium, allowing everyone that hears the message to use it for revising their
local variables, in Step 6. Therefore, RHA avoids issues D6 and D5 with overlapping iterations
and wasted receptions. Furthermore, as RHA operates asynchronously and calculates the average
directly, it circumvents issues D1–D4 with costly discretization, clock synchronization, forced
transmissions, and computing intermediate quantities. To show that it overcomes issues D8 and D9 with
steady-state errors and convergence guarantees, consider a quadratic Lyapunov function candidate
$V : \mathbb{R}^L \to \mathbb{R}$, defined as

$$V(\mathbf{x}(k)) = \sum_{\{i,j\} \in \mathcal{E}} c_{\{i,j\}} \bigl(x_{\{i,j\}}(k) - x^*\bigr)^2. \tag{16}$$

Clearly, $V$ in (16) is positive definite with respect to $(x^*, x^*, \ldots, x^*) \in \mathbb{R}^L$, and the condition

$$\lim_{k \to \infty} V(\mathbf{x}(k)) = 0 \tag{17}$$

implies (11) and thus (10). The following lemma shows that $V(\mathbf{x}(k))$ is always non-increasing and
quantifies its changes:

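Lemma 1. Consider the wireless network modeled in Section 2 and the use of RHA described in
Algorithm 1. Then, for any sequence $(u(k))_{k=1}^{\infty}$, the sequence $(V(\mathbf{x}(k)))_{k=0}^{\infty}$ is non-increasing and
satisfies

$$V(\mathbf{x}(k)) - V(\mathbf{x}(k-1)) = -\sum_{j \in \mathcal{N}_{u(k)}} c_{\{u(k),j\}} \bigl(x_{\{u(k),j\}}(k-1) - x_{u(k)}(k-1)\bigr)^2, \quad \forall k \in \mathbb{P}. \tag{18}$$

Proof. From (16) and the bottom of (6), $V(\mathbf{x}(k)) - V(\mathbf{x}(k-1)) = -\sum_{j \in \mathcal{N}_{u(k)}} c_{\{u(k),j\}} \bigl(-x_{\{u(k),j\}}^2(k) +
2x_{\{u(k),j\}}(k)x^* + x_{\{u(k),j\}}^2(k-1) - 2x_{\{u(k),j\}}(k-1)x^*\bigr)$ $\forall k \in \mathbb{P}$. Due to the top of (6), the second
term $-\sum_{j \in \mathcal{N}_{u(k)}} 2c_{\{u(k),j\}}\, x_{\{u(k),j\}}(k)\, x^*$ cancels the fourth term $\sum_{j \in \mathcal{N}_{u(k)}} 2c_{\{u(k),j\}}\, x_{\{u(k),j\}}(k-1)\, x^*$.
Moreover, note from (6) and (7) that $x_{\{u(k),j\}}(k) = x_{u(k)}(k-1)$ $\forall j \in \mathcal{N}_{u(k)}$. Hence, $V(\mathbf{x}(k)) -
V(\mathbf{x}(k-1)) = -\sum_{j \in \mathcal{N}_{u(k)}} c_{\{u(k),j\}} \bigl(x_{u(k)}^2(k-1) - 2x_{u(k)}(k-1)\, x_{\{u(k),j\}}(k) + x_{\{u(k),j\}}^2(k-1)\bigr)$ $\forall k \in \mathbb{P}$.
Due again to the top of (6), the second term $\sum_{j \in \mathcal{N}_{u(k)}} 2c_{\{u(k),j\}}\, x_{u(k)}(k-1)\, x_{\{u(k),j\}}(k)$ equals
$\sum_{j \in \mathcal{N}_{u(k)}} 2c_{\{u(k),j\}}\, x_{u(k)}(k-1)\, x_{\{u(k),j\}}(k-1)$. Thus, (18) holds. Since the right-hand side of (18)
is nonpositive, $(V(\mathbf{x}(k)))_{k=0}^{\infty}$ is non-increasing. □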

Lemma 1 says that $V(\mathbf{x}(k)) \le V(\mathbf{x}(k-1))$ $\forall k \in \mathbb{P}$. Since $V(\mathbf{x}(k)) \ge 0$ $\forall \mathbf{x}(k) \in \mathbb{R}^L$, this implies
that $\lim_{k \to \infty} V(\mathbf{x}(k))$ exists and is nonnegative. The following theorem asserts that this limit is
almost surely zero, so that RHA is almost surely asymptotically convergent to $x^*$:

Theorem 1. Consider the wireless network modeled in Section 2 and the use of RHA described in
Algorithm 1. Then, with probability 1, (17), (11), and (10) hold.

Proof. By associating the line graph of $\mathcal{G}$ with the graph in [10], RHA may be viewed as a special
case of the algorithm (1) in [10]. Note from (6) and (14) that the diagonal entries of $\mathbf{A}_i$ $\forall i \in \mathcal{V}$ are
positive, from (9) that $\mathrm{P}\{\mathbf{A}_{u(k)} = \mathbf{A}_i\} = \frac{1}{N}$ $\forall k \in \mathbb{P}$ $\forall i \in \mathcal{V}$, and from the connectedness of $\mathcal{G}$ that
its line graph is connected. Thus, by Corollary 3.2 of [10], with probability 1, $\exists \bar{x} \in \mathbb{R}$ such that
$\lim_{k \to \infty} x_{\{i,j\}}(k) = \bar{x}$ $\forall \{i,j\} \in \mathcal{E}$. Due to (1), (12), and (13), $\bar{x} = x^*$, i.e., (11) holds almost surely.
Because of (16) and (7), so do (17) and (10). □
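
The decrement formula of Lemma 1 can be verified exactly on a small instance. The sketch below is illustrative and rests on assumptions: the 4-node graph, the observations, the initiator sequence, and the helper names `V` and `step` are not from the paper. It uses rational arithmetic so that the change in the Lyapunov function and the weighted sum of squared deviations from the initiator's estimate can be compared for equality rather than approximately.

```python
from fractions import Fraction

# Assumed example: triangle 1-2-3 with a pendant node 4 attached to node 3.
neighbors = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
y = {1: Fraction(3), 2: Fraction(-2), 3: Fraction(10), 4: Fraction(5)}
x_star = sum(y.values()) / len(y)

deg = {i: len(neighbors[i]) for i in neighbors}
links = {frozenset((i, j)) for i in neighbors for j in neighbors[i]}
c, x = {}, {}
for e in links:
    i, j = tuple(e)
    c[e] = Fraction(1, deg[i]) + Fraction(1, deg[j])   # link parameter
    x[e] = (y[i] / deg[i] + y[j] / deg[j]) / c[e]      # initial link state


def V(state):
    """Quadratic Lyapunov function: weighted squared distance to x*."""
    return sum(c[e] * (state[e] - x_star) ** 2 for e in links)


def step(state, u):
    """One iteration initiated by node u. Returns the new state and the
    decrement predicted by Lemma 1."""
    inc = [frozenset((u, j)) for j in neighbors[u]]
    x_u = sum(c[e] * state[e] for e in inc) / sum(c[e] for e in inc)
    predicted = -sum(c[e] * (state[e] - x_u) ** 2 for e in inc)
    new = dict(state)
    for e in inc:
        new[e] = x_u
    return new, predicted


checks, values = [], [V(x)]
for u in [3, 1, 4, 2, 3, 3, 1]:      # arbitrary initiator sequence
    new, predicted = step(x, u)
    checks.append(V(new) - V(x) == predicted)
    values.append(V(new))
    x = new
print(checks, [float(v) for v in values])
```

Because the identity is algebraic, it holds for any initiator sequence, random or not, which is why $V$ serves as a common Lyapunov function for all the matrices $\mathbf{A}_i$.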

As it follows from Theorem 1 and the above, RHA solves the distributed averaging problem,
while eliminating deficiencies D1–D9 facing the existing schemes except for D7 on lack of control.
Lemma 1 above also says that $V$ in (16) is a common quadratic Lyapunov function for the linear
switched system (8). This $V$ will be used next to introduce control and remove D7.

5 Controlled Hopwise Averaging

5.1 Motivation for Feedback Iteration Control 

RHA operates by executing ([6]) or ([8]) according to {u{k))'^^. Although, by Theorem[Tl almost 
any {u{k))^^^ can drive all the Xi{k)^s in ([7]) to any neighborhood of x* , certain sequences require 
fewer iterations (and, hence, fewer real-number transmissions) to do so than others, yielding better 
bandwidth/energy efficiency. To see this, consider the following proposition: 

Proposition 1. The matrices Ai, A2, . . . , A^r in ^ are idempotent, i.e., A? = Aj Vi S V. 
Moreover, Aj and Aj are commutative whenever {i,j} ^ £, i.e., AjAj = AjAj Vi, j G V, {i,j} ^ £■ 

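Proof. Notice from (6) and (8) that for any $i \in \mathcal{V}$, if $\mathbf{x}(k) = A_i \mathbf{x}(k-1)$, then $x_{\{i,j\}}(k)$ $\forall j \in \mathcal{N}_i$ are
set equal to the same convex combination of $x_{\{i,j\}}(k-1)$ $\forall j \in \mathcal{N}_i$, and $x_{\{p,q\}}(k) = x_{\{p,q\}}(k-1)$
$\forall \{p,q\} \in \mathcal{E} - \cup_{j \in \mathcal{N}_i} \{\{i,j\}\}$. Thus, $A_i \mathbf{x}(k) = \mathbf{x}(k)$, so that $A_i^2 = A_i$. Moreover, for any $i, j \in \mathcal{V}$
with $\{i,j\} \notin \mathcal{E}$, because $\{\{i,\ell\} : \ell \in \mathcal{N}_i\} \cap \{\{j,\ell\} : \ell \in \mathcal{N}_j\} = \emptyset$, $A_i A_j = A_j A_i$. □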

The idempotence and partial commutativity of $A_1, A_2, \ldots, A_N$ from Proposition 1, together
with the fact that the switched system (8) may be stated as $\mathbf{x}(k) = A_{u(k)} A_{u(k-1)} \cdots A_{u(1)} \mathbf{x}(0)$
$\forall k \in \mathbb{P}$, imply that for a given $(u(k))_{k=1}^\infty$, the event $\mathbf{x}(k) = \mathbf{x}(k-1)$ can occur for quite a few
$k$'s, each of which signifies a wasted iteration. Furthermore, if the event $\mathbf{x}(k) = \mathbf{x}(k-1)$ does
occur for at least one $k$, then by deleting from $(u(k))_{k=1}^\infty$ some of its elements that correspond to
the wasted iterations, we obtain a new sequence $(u'(k))_{k=1}^\infty$ that is more efficient. To illustrate
these two points, consider, for instance, a 5-node cycle graph with $\mathcal{V} = \{1,2,3,4,5\}$ and
$\mathcal{E} = \{\{1,2\},\{2,3\},\{3,4\},\{4,5\},\{5,1\}\}$. Notice that if $(u(k))_{k=1}^\infty = (1,1,3,4,1,2,4,5,2,5,\ldots)$, then as
many as 5 out of the first 10 iterations are wasted, namely, the 2nd, 5th, 7th, 9th, and 10th elements,
each of which repeats a node none of whose links has changed since that node last initiated an
iteration. By deleting these elements and keeping the rest intact, we obtain a new sequence
$(u'(k))_{k=1}^\infty = (1,3,4,2,5,\ldots)$ that is 5 real-number transmissions more efficient than $(u(k))_{k=1}^\infty$.
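
The wasted iterations in the 5-node cycle example can be checked mechanically. The following is a minimal sketch, not the authors' code: it assumes that an iteration initiated by node i overwrites every link incident to i with a weighted average of those links, and it uses made-up initial link values and unit weights (the names `hopwise_step` and `wasted_positions` are hypothetical). Exact rational arithmetic is used so that "no change" can be tested with equality.

```python
from fractions import Fraction

# 5-node cycle from the example: V = {1,...,5}, links as listed in the text.
LINKS = [frozenset(p) for p in [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]]
NEIGHBORS = {i: [j for j in range(1, 6) if frozenset((i, j)) in LINKS]
             for i in range(1, 6)}


def hopwise_step(x, c, i):
    """Return the link states after node i initiates one iteration.

    Every link incident to node i is overwritten by the c-weighted average
    of those links; all other links are left unchanged.
    """
    links = [frozenset((i, j)) for j in NEIGHBORS[i]]
    avg = sum(c[e] * x[e] for e in links) / sum(c[e] for e in links)
    new_x = dict(x)
    for e in links:
        new_x[e] = avg
    return new_x


def wasted_positions(u, x0, c):
    """Return the 1-based positions k in u for which x(k) == x(k-1)."""
    x, wasted = dict(x0), []
    for k, i in enumerate(u, start=1):
        new_x = hopwise_step(x, c, i)
        if new_x == x:
            wasted.append(k)
        x = new_x
    return wasted


# Assumed (made-up) initial link values 1, 2, 4, 8, 16 and unit weights.
x0 = {e: Fraction(2 ** n) for n, e in enumerate(LINKS)}
c = {e: Fraction(1) for e in LINKS}
u = [1, 1, 3, 4, 1, 2, 4, 5, 2, 5]
print(wasted_positions(u, x0, c))   # [2, 5, 7, 9, 10]
```

Running the same check on the shortened sequence (1, 3, 4, 2, 5) reports no wasted positions, which matches the claim that the deleted elements contributed nothing.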

The preceding analysis shows that RHA is prone to wasteful iterations, which is a primary
reason why certain sequences are more efficient than others. RHA, however, makes no attempt
to distinguish the sequences, as it lets every possible $(u(k))_{k=1}^\infty$ be equiprobable, via (9). In other
words, it does not try to control how the asynchronous iterations occur and, thus, suffers from D7.

Remark 3. Wasteful iterations incurred by idempotent and partially commutative operations are
not an attribute unique to RHA, but one that is shared by Pairwise Averaging [12], Anti-Entropy
Aggregation [13, 14], Randomized Gossip Algorithm [15], and Distributed Random Grouping [17]
(indeed, the examples provided in D7 against the latter two algorithms were created from this
attribute). What is different is that in this paper, we view the attribute as a limitation and find
ways to overcome it, whereas in [12]–[15], [17], the attribute was not viewed as such. ■

One way to control the iterations, alluded to in Remark [U is to replace Q with a general 
distribution $P\{u(k) = i\} = p_i$ $\forall k \in \mathbb{P}$ $\forall i \in \mathcal{V}$ and then choose the $p_i$'s to maximize efficiency, before
any averaging task begins. This approach, however, has an inherent shortcoming: because the $p_i$'s
are optimized once and for all, they are constant and do not adapt to $\mathbf{x}(k)$ during runtime. Hence,
optimal or not, the $p_i$'s almost surely would produce inefficient, wasteful $(u(k))_{k=1}^\infty$. The fact that
the nodes do not adjust the $p_i$'s based on information they pick up during runtime also suggests
that this way of controlling the iterations may be considered open loop.

The aforementioned shortcoming of open-loop iteration control raises the question of whether it 
is possible to introduce some form of closed-loop iteration control as a means to generate efficient, 
non-wasteful $(u(k))_{k=1}^\infty$. Obviously, to carry out closed-loop iteration control, feedback is needed.
Due to the distributed nature of the network, however, feedback may be expensive to acquire: if an 
algorithm demands that the feedback used by a node be a function of state variables maintained 
by other nodes, then additional communications are necessary to implement the feedback. Such 
communications can produce plenty of real-number transmissions, which must all count toward 
the total real-number transmissions, when evaluating the algorithm's bandwidth/energy efficiency. 
Thus, in the design of feedback algorithms, the cost of "closing the loop" cannot be overlooked. 

In this section, we first describe an approach to closed-loop iteration control, which leads to 
highly efficient and surely non-wasteful $(u(k))_{k=1}^\infty$ at zero feedback cost. Based on this approach,
we then present and analyze two modified versions of RHA: an ideal version and a practical one. 

5.2 Approach to Feedback Iteration Control 

Note that with RHA, $(u(k))_{k=1}^\infty$ is undefined at the moment an averaging task begins and is
gradually defined, one element per iteration, as time elapses, i.e., when a node $i \in \mathcal{V}$ initiates an
iteration $k \in \mathbb{P}$, the element $u(k)$ becomes defined and is given by $u(k) = i$. Thus, by controlling
when to initiate an iteration, the nodes may jointly shape the value of $(u(k))_{k=1}^\infty$. With RHA, this
opportunity to shape $(u(k))_{k=1}^\infty$ is not utilized, as the nodes simply randomly and equiprobably
decide when to initiate an iteration. To exploit the opportunity, suppose henceforth that the nodes
wish to control when to initiate an iteration using some form of feedback. The questions are:

Q1. What feedback to use, so that the corresponding feedback cost is minimal?
Q2. How to control, so that the resulting $(u(k))_{k=1}^\infty$ is highly efficient?
Q3. How to control, so that the resulting $(u(k))_{k=1}^\infty$ is surely non-wasteful?



To answer questions Q1–Q3, we first show that RHA, along with the common quadratic Lyapunov
function $V$ of (16), exhibits the following features:

F1. Although the nodes never know the value of $V$, every one of them at any time knows by how
much the value would drop if it suddenly initiates an iteration.
F2. The faster $(u(k))_{k=1}^\infty$ makes the value of $V$ drop to zero, the more efficient it is.
F3. If the value of $V$ does not drop after an iteration, then the iteration is wasted, causing
$(u(k))_{k=1}^\infty$ to be wasteful. The converse is also true.

The first part of feature F1 can be seen by noting that $V(\mathbf{x}(k))$ in (16) depends on $c_{\{i,j\}}$
$\forall \{i,j\} \in \mathcal{E}$, $x_{\{i,j\}}(k)$ $\forall \{i,j\} \in \mathcal{E}$, and $x^*$, whereas each node $i \in \mathcal{V}$ only knows $c_{\{i,j\}}$ $\forall j \in \mathcal{N}_i$ and
$x_{\{i,j\}}(k)$ $\forall j \in \mathcal{N}_i$. To see the second part, suppose a node $i \in \mathcal{V}$ initiates an iteration $k \in \mathbb{P}$ at
some time instant $t$, so that $u(k) = i$ by definition. Observe from Lemma 1 that whoever node
$u(k)$ is, upon completing this iteration, the value of $V$ would drop from $V(\mathbf{x}(k-1))$ to $V(\mathbf{x}(k))$ by
an amount equal to the right-hand side of (18). To compactly represent this drop, for each $i \in \mathcal{V}$
let $\Delta V_i : \mathbb{R}^L \to \mathbb{R}$ be a positive semidefinite quadratic function, defined as

$$\Delta V_i(\mathbf{x}(k)) = \sum_{j \in \mathcal{N}_i} c_{\{i,j\}} \big(x_{\{i,j\}}(k) - \hat{x}_i(k)\big)^2, \quad \forall k \in \mathbb{N}, \tag{19}$$

where $\hat{x}_i(k)$ is as in (7). Then, with (19), (18) may be written as

$$V(\mathbf{x}(k)) - V(\mathbf{x}(k-1)) = -\Delta V_{u(k)}(\mathbf{x}(k-1)), \quad \forall k \in \mathbb{P}, \tag{20}$$

where $\Delta V_{u(k)}(\mathbf{x}(k-1))$ in (20) represents the amount of drop, i.e.,

$$\Delta V_{u(k)}(\mathbf{x}(k-1)) = \sum_{j \in \mathcal{N}_{u(k)}} c_{\{u(k),j\}} \big(x_{\{u(k),j\}}(k-1) - \hat{x}_{u(k)}(k-1)\big)^2, \quad \forall k \in \mathbb{P}. \tag{21}$$

Notice that $\Delta V_{u(k)}(\mathbf{x}(k-1))$ in (21) depends on parameters and variables maintained by node $u(k)$,
whose values are known to node $u(k)$ prior to iteration $k$ at time $t$. Therefore, before initiating this
iteration at time $t$, node $u(k)$ already knows that the value of $V$ would drop by $\Delta V_{u(k)}(\mathbf{x}(k-1))$.
Since $t$, $k$, and $u(k)$ are arbitrary, this means that every node $i \in \mathcal{V}$ at any time knows by how
much the value of $V$ would drop if it suddenly initiates an iteration (i.e., by $\Delta V_i(\mathbf{x}(\cdot))$). This
establishes feature F1. To show feature F2, recall that: (i) $V(\mathbf{x}(k))$ in (16) is a measure of the
deviation of the $x_{\{i,j\}}(k)$'s from $x^*$; (ii) the $\hat{x}_i(k)$'s in (7) are convex combinations of the $x_{\{i,j\}}(k)$'s;
(iii) bandwidth/energy efficiency is measured by the number of real-number transmissions needed
for all the $\hat{x}_i(k)$'s to converge to a given neighborhood of $x^*$; and (iv) RHA in Algorithm 1 has a
fixed, one real-number transmission per iteration. Hence, the faster $(u(k))_{k=1}^\infty$ drives $V(\mathbf{x}(k))$ to
zero, the faster it drives the $x_{\{i,j\}}(k)$'s and $\hat{x}_i(k)$'s to $x^*$ (due to (i) and (ii)), and the more efficient
it is (due to (iii) and (iv)). Finally, to show feature F3, suppose $V(\mathbf{x}(k)) = V(\mathbf{x}(k-1))$ after an
iteration $k \in \mathbb{P}$. Then, it follows from (20) that $\Delta V_{u(k)}(\mathbf{x}(k-1)) = 0$, from (21) that $x_{\{u(k),j\}}(k-1)$
$\forall j \in \mathcal{N}_{u(k)}$ are equal, and from (6) that $\mathbf{x}(k) = \mathbf{x}(k-1)$. Thus, iteration $k$ is wasted. The converse
is also true, as $\mathbf{x}(k) = \mathbf{x}(k-1)$ implies $V(\mathbf{x}(k)) = V(\mathbf{x}(k-1))$.
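
The drop identity (20) can be verified numerically on a toy network. The sketch below is illustrative only. It assumes that V has the weighted sum-of-squares form suggested by the description of (16), that the node estimate in (7) is the c-weighted average of the node's link states, and that the weights and link values are made up; the function names are hypothetical. The identity holds for any positive weights, so the check does not depend on how the c's are defined.

```python
from fractions import Fraction


def estimate(x, c, nbrs, i):
    """c-weighted average of the link states seen by node i (assumed form of (7))."""
    num = sum(c[frozenset((i, j))] * x[frozenset((i, j))] for j in nbrs[i])
    den = sum(c[frozenset((i, j))] for j in nbrs[i])
    return num / den


def delta_v(x, c, nbrs, i):
    """Potential drop (19): sum over j in N_i of c_{i,j} (x_{i,j} - xhat_i)^2."""
    xhat = estimate(x, c, nbrs, i)
    return sum(c[frozenset((i, j))] * (x[frozenset((i, j))] - xhat) ** 2
               for j in nbrs[i])


def lyapunov(x, c, x_star):
    """Assumed form of V in (16): sum over links of c (x - x*)^2."""
    return sum(c[e] * (x[e] - x_star) ** 2 for e in x)


def initiate(x, c, nbrs, i):
    """Node i overwrites all of its incident links with its estimate."""
    xhat = estimate(x, c, nbrs, i)
    new_x = dict(x)
    for j in nbrs[i]:
        new_x[frozenset((i, j))] = xhat
    return new_x


# Hypothetical 4-node path 1-2-3-4 with made-up link states and weights.
nbrs = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
x = {frozenset((1, 2)): Fraction(3), frozenset((2, 3)): Fraction(-1),
     frozenset((3, 4)): Fraction(5)}
c = {frozenset((1, 2)): Fraction(3, 2), frozenset((2, 3)): Fraction(1),
     frozenset((3, 4)): Fraction(3, 2)}
x_star = sum(c[e] * x[e] for e in x) / sum(c.values())

for i in nbrs:
    drop = lyapunov(x, c, x_star) - lyapunov(initiate(x, c, nbrs, i), c, x_star)
    assert drop == delta_v(x, c, nbrs, i)   # identity (20), checked exactly
    print(i, delta_v(x, c, nbrs, i))
```

Node 1 has a single link, so its potential drop is zero and an iteration initiated by it would be wasted, which is feature F3 in miniature.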

Having demonstrated features F1–F3, we now use them to answer questions Q1–Q3. Feature F1
suggests that every node $i \in \mathcal{V}$ may use $\Delta V_i(\mathbf{x}(\cdot))$, which it always knows, as feedback to control,
on its own, when to initiate an iteration. As the feedbacks $\Delta V_i(\mathbf{x}(\cdot))$'s are locally available and the
control decisions are made locally, the resulting feedback control architecture is fully decentralized,
requiring zero communication cost to realize. Therefore, an answer to question Q1 is:

A1. Each node $i \in \mathcal{V}$ uses $\Delta V_i(\mathbf{x}(\cdot))$ as feedback to control when to initiate an iteration.

Feature F2 suggests that, to produce highly efficient $(u(k))_{k=1}^\infty$, the nodes may focus on making
the value of $V$ drop significantly after each iteration, especially initially. In other words, they may
focus on letting every iteration be initiated by a node $i$ with a relatively large $\Delta V_i(\mathbf{x}(\cdot))$. With
architecture A1, this may be accomplished if nodes with larger $\Delta V_i(\mathbf{x}(\cdot))$'s would rush to initiate,
while nodes with smaller $\Delta V_i(\mathbf{x}(\cdot))$'s would wait longer. Hence, an answer to question Q2 is:

A2. The larger $\Delta V_i(\mathbf{x}(\cdot))$ is, the sooner node $i$ initiates an iteration (i.e., the smaller $\Delta V_i(\mathbf{x}(\cdot))$ is,
the longer node $i$ waits).

Finally, feature F3 suggests that, to generate surely non-wasteful $(u(k))_{k=1}^\infty$, the value of $V$ must
strictly decrease after each iteration. With architecture A1, this can be achieved if nodes with zero
$\Delta V_i(\mathbf{x}(\cdot))$'s would refrain from initiating an iteration. Thus, an answer to question Q3 is:

A3. Whenever $\Delta V_i(\mathbf{x}(\cdot)) = 0$, node $i$ refrains from initiating an iteration.

Answers A1–A3 describe a greedy, decentralized approach to feedback iteration control, where
potential drops $\Delta V_i(\mathbf{x}(\cdot))$'s in the value of $V$ are used to drive the asynchronous iterations. This
approach may be viewed as a greedy approach because the nodes seek to make the value of V 
drop as much as possible at each iteration, without considering the future. Because the nodes 
also seek to fully exploit the broadcast nature of every wireless transmission (a feature inherited 
from Steps 5 and 6 of RHA), this approach strives to "make the most" out of each iteration. Note 
that although Lyapunov functions have been used to analyze distributed averaging and consensus 
algorithms (e.g., in the form of a disagreement function pQ or a set-valued convex hull [24]). their 
use for controlling such algorithms has not been reported. Therefore, this approach represents a 
new way to apply Lyapunov stability theory. 

5.3 Ideal Version 

In this subsection, we use the aforementioned approach to create an ideal, modified version of 
RHA, which possesses strong convergence properties that motivate a practical version. 

The above approach wants the nodes to try to be greedy. Thus, it is of interest to analyze an
ideal scenario where, instead of just trying, the nodes actually succeed at being greedy, ensuring
that every iteration $k \in \mathbb{P}$ is initiated by a node $i \in \mathcal{V}$ with the maximum $\Delta V_i(\mathbf{x}(k-1))$, i.e.,

$$u(k) \in \arg\max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k-1)), \quad \forall k \in \mathbb{P}, \tag{22}$$

so that $V(\mathbf{x}(k-1))$ drops maximally to $V(\mathbf{x}(k))$ for every $k \in \mathbb{P}$. Notice that (22) does not
always uniquely determine $u(k)$: when multiple nodes have the same maximum, $u(k)$ may be any
of these nodes. Although $u(k)$ can be made unique (e.g., by letting $u(k)$ be the minimum of
$\arg\max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k-1))$), in the analysis below we will allow for arbitrary $u(k)$ satisfying (22).
Also note that in the rare case where $\Delta V_i(\mathbf{x}(k^*-1)) = 0$ $\forall i \in \mathcal{V}$ for some $k^* \in \mathbb{P}$, due to (1),
(12), (13), (19), and the connectedness of the graph $\mathcal{G}$, we have $x_{\{i,j\}}(k^*-1) = x^*$ $\forall \{i,j\} \in \mathcal{E}$ and
$\hat{x}_i(k^*-1) = x^*$ $\forall i \in \mathcal{V}$, thereby solving the problem in finite time. Furthermore, due to A3, all the
nodes would refrain from initiating iteration $k^*$ (and beyond), thereby terminating the algorithm
in finite time and causing $x_{\{i,j\}}(k)$ $\forall \{i,j\} \in \mathcal{E}$, $\hat{x}_i(k)$ $\forall i \in \mathcal{V}$, $u(k)$, and $V(\mathbf{x}(k))$ to be undefined
$\forall k \geq k^*$. In the analysis below, however, we will allow the algorithm to keep executing according
to (22), so that $x_{\{i,j\}}(k)$ $\forall \{i,j\} \in \mathcal{E}$, $\hat{x}_i(k)$ $\forall i \in \mathcal{V}$, $u(k)$, and $V(\mathbf{x}(k))$ are defined $\forall k$.

Equation (22), together with (6), (7), (14), (15), and (19), defines a networked dynamical system
that switches among $N$ different dynamics, depending on where the state is in the state space, i.e.,
if $\mathbf{x}(k-1)$ is such that $\Delta V_i(\mathbf{x}(k-1)) > \Delta V_j(\mathbf{x}(k-1))$ $\forall j \in \mathcal{V} - \{i\}$, then $\mathbf{x}(k) = A_i \mathbf{x}(k-1)$.
This system may be expressed in the form of an algorithm, which we refer to as Ideal Controlled
Hopwise Averaging (ICHA), as follows:

Algorithm 2 (Ideal Controlled Hopwise Averaging). 
Initialization: 

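1. Each node $i \in \mathcal{V}$ transmits $|\mathcal{N}_i|$ and $y_i$ to every node $j \in \mathcal{N}_i$.

2. Each node $i \in \mathcal{V}$ creates variables $x_{ij} \in \mathbb{R}$ $\forall j \in \mathcal{N}_i$, $\hat{x}_i \in \mathbb{R}$, and $\Delta V_i \in [0, \infty)$ and initializes
them sequentially: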

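$x_{ij} \leftarrow \dfrac{y_i/|\mathcal{N}_i| + y_j/|\mathcal{N}_j|}{c_{\{i,j\}}}, \quad \forall j \in \mathcal{N}_i,$

$\hat{x}_i \leftarrow \dfrac{\sum_{j \in \mathcal{N}_i} c_{\{i,j\}} x_{ij}}{\sum_{j \in \mathcal{N}_i} c_{\{i,j\}}},$

$\Delta V_i \leftarrow \sum_{j \in \mathcal{N}_i} c_{\{i,j\}} (x_{ij} - \hat{x}_i)^2.$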



Operation: At each iteration: 

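3. Let $i \in \arg\max_{j \in \mathcal{V}} \Delta V_j$.

4. Node $i$ updates $x_{ij}$ $\forall j \in \mathcal{N}_i$ and $\Delta V_i$ sequentially:

$x_{ij} \leftarrow \hat{x}_i, \quad \forall j \in \mathcal{N}_i,$

$\Delta V_i \leftarrow 0.$

5. Node $i$ transmits $\hat{x}_i$ to every node $j \in \mathcal{N}_i$.

6. Each node $j \in \mathcal{N}_i$ updates $x_{ji}$, $\hat{x}_j$, and $\Delta V_j$ sequentially:

$x_{ji} \leftarrow \hat{x}_i,$

$\hat{x}_j \leftarrow \dfrac{\sum_{\ell \in \mathcal{N}_j} c_{\{j,\ell\}} x_{j\ell}}{\sum_{\ell \in \mathcal{N}_j} c_{\{j,\ell\}}},$

$\Delta V_j \leftarrow \sum_{\ell \in \mathcal{N}_j} c_{\{j,\ell\}} (x_{j\ell} - \hat{x}_j)^2.$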

Algorithm 2, or ICHA, is identical to RHA in Algorithm 1 except that each node $i$ also maintains
$\Delta V_i$, in Steps 2, 4, and 6, and that each iteration is initiated by a node $i$ experiencing the maximum
$\Delta V_i$, in Step 3. Note that "$\Delta V_i \leftarrow 0$" in Step 4 is equivalent to "$\Delta V_i \leftarrow \sum_{j \in \mathcal{N}_i} c_{\{i,j\}} (x_{ij} - \hat{x}_i)^2$"
since $x_{ij}$ $\forall j \in \mathcal{N}_i$ and $\hat{x}_i$ are equal at that point. The fact that $\Delta V_i$ goes from being the maximum
to zero whenever node $i$ initiates an iteration also suggests that it may be a while before $\Delta V_i$
becomes the maximum again, causing node $i$ to initiate another iteration.
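
To make the greedy rule of Step 3 concrete, here is a minimal centralized simulation sketch of ICHA. It is not the authors' implementation. Two ingredients are assumptions because their definitions lie outside this section: the link weights are taken to be c_{i,j} = 1/|N_i| + 1/|N_j|, and the link states are initialized as (y_i/|N_i| + y_j/|N_j|)/c_{i,j}, which makes the weighted sum of link states equal the sum of the observations. The function name `icha` and the dictionary-based data layout are hypothetical.

```python
def icha(nbrs, y, iterations):
    """Run a sketch of ICHA for a fixed number of iterations; return node estimates.

    Assumptions (not stated in this section): c_{i,j} = 1/|N_i| + 1/|N_j| and
    x_{i,j}(0) = (y_i/|N_i| + y_j/|N_j|) / c_{i,j}.
    """
    deg = {i: len(nbrs[i]) for i in nbrs}

    def link(i, j):
        return frozenset((i, j))

    c = {link(i, j): 1 / deg[i] + 1 / deg[j] for i in nbrs for j in nbrs[i]}
    x = {link(i, j): (y[i] / deg[i] + y[j] / deg[j]) / c[link(i, j)]
         for i in nbrs for j in nbrs[i]}

    def xhat(i):
        """Node i's estimate: c-weighted average of its link states."""
        return (sum(c[link(i, j)] * x[link(i, j)] for j in nbrs[i])
                / sum(c[link(i, j)] for j in nbrs[i]))

    def dv(i):
        """Potential drop of V if node i initiated an iteration now."""
        est = xhat(i)
        return sum(c[link(i, j)] * (x[link(i, j)] - est) ** 2 for j in nbrs[i])

    for _ in range(iterations):
        i = max(nbrs, key=dv)          # Step 3: greedy choice (ties: first node)
        est = xhat(i)
        for j in nbrs[i]:              # Steps 4-6: node i's links take its estimate
            x[link(i, j)] = est
    return {i: xhat(i) for i in nbrs}


# Usage: a 4-node path whose observations average to 4.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(icha(path, {1: 1.0, 2: 2.0, 3: 3.0, 4: 10.0}, 200))
```

The global `max` stands in for the idealized assumption that the node with the largest potential drop always initiates; a real network has no such oracle, which is what the practical version addresses.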

The convergence properties of ICHA on general networks are characterized in the following
theorem, in which $\mathbf{1}_n \in \mathbb{R}^n$ and $\hat{\mathbf{x}}(k) \in \mathbb{R}^N$ denote, respectively, the vectors obtained by stacking
$n$ 1's and the $N$ $\hat{x}_i(k)$'s:

Theorem 2. Consider the wireless network modeled in Section\^ and the use of ICHA described 
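in Algorithm 2. Then,

$$V(\mathbf{x}(k)) \leq \Big(1 - \frac{1}{\gamma}\Big) V(\mathbf{x}(k-1)), \quad \forall k \in \mathbb{P}, \tag{23}$$

$$\|\mathbf{x}(k) - x^* \mathbf{1}_L\| \leq \sqrt{\frac{V(\mathbf{x}(0)) \max_{i \in \mathcal{V}} |\mathcal{N}_i|}{2}} \, \Big(1 - \frac{1}{\gamma}\Big)^{k/2}, \quad \forall k \in \mathbb{N}, \tag{24}$$

$$\|\hat{\mathbf{x}}(k) - x^* \mathbf{1}_N\| \leq \sqrt{\frac{2 V(\mathbf{x}(0)) \max_{i \in \mathcal{V}} |\mathcal{N}_i|}{\min_{i \in \mathcal{V}} |\mathcal{N}_i| + \max_{i \in \mathcal{V}} |\mathcal{N}_i|}} \, \Big(1 - \frac{1}{\gamma}\Big)^{k/2}, \quad \forall k \in \mathbb{N}, \tag{25}$$
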

where j e [f + I, N^ - 2iV^ + y + 1] is ^iwen 6?/ 

7 = y + « + ^ , (26) 

and where a = max{,,,|g^ ^ G [1, ^'-f+^ ], /? = E^eV E,eMu{.} ^^^.^ G [iV + |(1 + 7v^)^ N% 

hi = \ Yliji^Mi ^{i,j} ^^ G ''^' ^'^'^ ^ ^'^ ^^^ network diameter. 

Proof. See Appendix A.1. □

Theorem 2 says that ICHA is exponentially convergent on any network, ensuring that $V(\mathbf{x}(k))$,
$\|\mathbf{x}(k) - x^* \mathbf{1}_L\|$, and $\|\hat{\mathbf{x}}(k) - x^* \mathbf{1}_N\|$ all go to zero exponentially fast, at a rate that is no worse than
$1 - \frac{1}{\gamma}$ or $(1 - \frac{1}{\gamma})^{1/2}$, so that $\gamma$ in (26) represents a bound on the convergence rate. It also says that
the bound $\gamma$ is between $\Omega(N)$ and $O(N^3)$ and depends only on $N$, $D$, and the $|\mathcal{N}_i|$'s, making it
easy to compute. The following corollary lists the bound $\gamma$ for a number of common graphs:

Corollary 1. The constant 7 in (|26p becomes: 

Gl. 7 = Af3 _ 4^2 _^ I^Y + I /or a path graph with N >5, 

G2. 7 = I A^3 _ ^^2 _ 1 jv _^ ^ if N is odd and 7 = | A^^ _ n ^2 _ 5 ^ _^ ^ if N is even for a 

cycle graph, 
G3. j = § + K+ iN~K~i)i3iN-i)-D)iD+i) ^^^ ^ K -regular graph with K>2, 

G4. $\gamma = \frac{3}{2}N - 1$ for a complete graph.

Proof. For a path graph with A^>5,a = |,/3 = 3N — 1, and D = N — I. For a cycle graph, 
$\alpha = 2$, $\beta = 3N$, $D = \frac{N-1}{2}$ if $N$ is odd, and $D = \frac{N}{2}$ if $N$ is even. For a $K$-regular graph with $K \geq 2$,
$\alpha = K$ and $\beta = N(K+1)$. For a complete graph, $\alpha = N - 1$ and $\beta = N^2$. Hence, G1–G4 hold. □

Each bound $\gamma$ in Corollary 1 is obtained by specializing (26) for arbitrary graphs to a specific one.
Conceivably, tighter bounds may be obtained by working with each of these graphs individually,
exploiting their particular structure. Theorem 3 below shows that this is indeed the case with path
and cycle graphs (6 and 15 times tighter, respectively), besides providing additional bounds for
regular and strongly regular graphs:

Figure 1: Comparison between the stochastic convergence rate $1 - \frac{1}{\gamma_{\mathrm{PA}}}$ of PA and the deterministic
bound $1 - \frac{1}{\gamma_{\mathrm{ICHA}}}$ on convergence rate of ICHA for path, cycle, and complete graphs.



Theorem 3. Consider the wireless network modeled in Section\^ and the use of ICHA described 
in Algorithm 2. Then, (23)–(25) hold with:
A^ + 3 for a path graph with A > 4,



24 ~^ T2^ ~'^~^ iw ^f ^ ^"^ '^'^'^ '^^'^ ^ 



2r + iA-3 + .^ 



+ K -\ Tj for a K -regular graph with K >2, 



SI 7 = | + i^ + MM±?)(izi^^ 



M 



Proof. See Appendix A.2.



-^ if N is even for a cycle graph, 
K 
for a (A, K, A, ^) -strongly regular graph with /i > 1. 

D 



Recently, [10] studied, among other things, the convergence rate of Pairwise Averaging (PA) [12].
The results in [10] are different from those above in three notable ways: first, the convergence rate
of PA is defined in [10] as the decay rate of the expected value of a Lyapunov-like function $d(k)$.
Although this stochastic measure captures the average behavior of PA, it offers little guarantee
on the decay rate of each realization $(d(k))_{k=0}^\infty$. In contrast, the bounds $\gamma$ on convergence rate of
ICHA above are deterministic, providing guarantees on the decay rate of $(V(\mathbf{x}(k)))_{k=0}^\infty$. Second,
even if the first difference is disregarded, the bounds of ICHA are still roughly 20% better than
the convergence rate of PA for a few common graphs. To justify this claim, let $1 - \frac{1}{\gamma_{\mathrm{PA}}}$ denote the
convergence rate of PA. Since PA requires two real-number transmissions per iteration while ICHA
requires only one, to enable a fair comparison we introduce a two-iteration bound $\gamma_{\mathrm{ICHA}}$ for ICHA,
defined as $\gamma_{\mathrm{ICHA}} = \frac{\gamma^2}{2\gamma - 1}$, so that $1 - \frac{1}{\gamma_{\mathrm{ICHA}}} = (1 - \frac{1}{\gamma})^2$. Figure 1 plots the ratio $\frac{\gamma_{\mathrm{ICHA}}}{\gamma_{\mathrm{PA}}}$ versus $N$ for
path, cycle, and complete graphs, where $\gamma_{\mathrm{PA}}$ is computed according to [10], while $\gamma_{\mathrm{ICHA}}$ is computed
using $\gamma$ in S1, S2, and G4. Observe that for $N \geq 50$, $\gamma_{\mathrm{ICHA}}$ is 18% smaller than $\gamma_{\mathrm{PA}}$ for path and
cycle graphs, and 25% so for complete graphs. The latter can also be shown analytically: since
$\gamma_{\mathrm{PA}} = N - 1$ and $\gamma_{\mathrm{ICHA}} = \frac{(\frac{3}{2}N - 1)^2}{2(\frac{3}{2}N - 1) - 1}$, we have $\lim_{N \to \infty} \frac{\gamma_{\mathrm{ICHA}}}{\gamma_{\mathrm{PA}}} = \frac{3}{4}$. This justifies the claim. Finally, unlike
$\gamma$ and $\gamma_{\mathrm{ICHA}}$, $\gamma_{\mathrm{PA}}$ in general cannot be expressed in a form that explicitly reveals its dependence on
the graph invariants. Indeed, it generally can only be computed by numerically finding the spectral
radius of an invariant subspace of an $N^2$-by-$N^2$ matrix, which may be prohibitive for large $N$.
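
The two-iteration bound and the complete-graph limit of 3/4 are easy to reproduce. The sketch below uses only quantities quoted in the text (the bound G4 for a complete graph and the value N − 1 attributed to [10] for PA); the function names are hypothetical.

```python
from fractions import Fraction


def two_iteration_bound(gamma):
    """Return gamma_ICHA, chosen so that 1 - 1/gamma_ICHA == (1 - 1/gamma)**2."""
    return gamma ** 2 / (2 * gamma - 1)


def complete_graph_ratio(n):
    """gamma_ICHA / gamma_PA for a complete graph on n nodes."""
    gamma = Fraction(3 * n, 2) - 1        # bound G4 for a complete graph
    gamma_pa = n - 1                      # value quoted from [10]
    return two_iteration_bound(gamma) / gamma_pa


for n in (10, 50, 1000):
    print(n, float(complete_graph_ratio(n)))
```

The printed ratios decrease toward 0.75 as N grows, in line with the 25% figure stated for complete graphs.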

5.4 Practical Version 

The strong convergence properties of ICHA suggest that its greedy behavior may be worthy of 
emulating. In this subsection, we derive a practical algorithm that closely mimics such behavior. 

Reconsider the system (6), (7), (14), (15) and suppose this system evolves in a discrete event
fashion, according to the following description: associated with the system is time, which is
real-valued, nonnegative, and denoted as $t \in [0, \infty)$, where $t = 0$ represents the time instant at which
the nodes have observed the $y_i$'s but have yet to execute an iteration. In addition, associated with
each node $i \in \mathcal{V}$ is an event, which is scheduled to occur at time $\tau_i \in (0, \infty]$ and is marked by
node $i$ initiating an iteration, where $\tau_i = \infty$ means the event will not occur. Each event time $\tau_i$
is a variable, which is initialized at time $t = 0$ to $\tau_i(0)$, is updated only at each iteration $k \in \mathbb{P}$
from $\tau_i(k-1)$ to $\tau_i(k)$, and is no less than $t$ at any time $t$, so that no event is scheduled to
occur in the past. Starting from $t = 0$, time advances to $t = \min_{i \in \mathcal{V}} \tau_i(0)$, at which an event,
marked by node $u(1) \in \arg\min_{i \in \mathcal{V}} \tau_i(0)$ initiating iteration 1, occurs, during which $\tau_i(1)$ $\forall i \in \mathcal{V}$ are
determined. Time then advances to $t = \min_{i \in \mathcal{V}} \tau_i(1)$, at which a subsequent event, marked by node
$u(2) \in \arg\min_{i \in \mathcal{V}} \tau_i(1)$ initiating iteration 2, occurs, during which $\tau_i(2)$ $\forall i \in \mathcal{V}$ are determined. In
the same way, time continues to advance toward infinity, while events continue to occur one after
another, except if $\tau_i(k) = \infty$ $\forall i \in \mathcal{V}$ for some $k \in \mathbb{N}$, for which the system terminates.

Having described how the system evolves, we now specify how $\tau_i(k)$ $\forall k \in \mathbb{N}$ $\forall i \in \mathcal{V}$ are recursively
determined. First, consider the time instant $t = 0$, at which $\tau_i(0)$ $\forall i \in \mathcal{V}$ need to be determined.
To behave greedily, nodes with the maximum $\Delta V_i(\mathbf{x}(0))$'s should have the minimum $\tau_i(0)$'s. This
may be accomplished by letting

$$\tau_i(0) = \Phi(\Delta V_i(\mathbf{x}(0))), \quad \forall i \in \mathcal{V}, \tag{27}$$

where $\Phi : [0, \infty) \to (0, \infty]$ is a continuous and strictly decreasing function satisfying $\lim_{v \to 0} \Phi(v) = \infty$
and $\Phi(0) = \infty$. Although, mathematically, (27) ensures that $V(\mathbf{x}(0))$ drops maximally to
$V(\mathbf{x}(1))$, in reality it is possible that multiple nodes have the same minimum $\tau_i(0)$'s, leading to
wireless collisions. To address this issue, we insert a little randomness into (27), rewriting it as

$$\tau_i(0) = \Phi(\Delta V_i(\mathbf{x}(0))) + \varepsilon(\Delta V_i(\mathbf{x}(0))) \cdot \mathrm{rand}(), \quad \forall i \in \mathcal{V}, \tag{28}$$

where $\varepsilon : [0, \infty) \to (0, \infty)$ is a continuous function meant to take on small positive values and
each call to rand() returns a uniformly distributed random number in $(0, 1)$. With (28), with high
probability iteration 1 is initiated by a node $i$ with the maximum, or a near-maximum, $\Delta V_i(\mathbf{x}(0))$.

Next, pick any $k \in \mathbb{P}$ and consider the time instant $t = \min_{i \in \mathcal{V}} \tau_i(k-1)$, at which node
$u(k) \in \arg\min_{i \in \mathcal{V}} \tau_i(k-1)$ initiates iteration $k$, during which $\tau_i(k)$ $\forall i \in \mathcal{V}$ need to be determined.
Again, to be greedy, nodes with the maximum $\Delta V_i(\mathbf{x}(k))$'s should have the minimum $\tau_i(k)$'s. At
first glance, this may be approximately accomplished following ideas from (28), i.e., by letting

$$\tau_i(k) = \Phi(\Delta V_i(\mathbf{x}(k))) + \varepsilon(\Delta V_i(\mathbf{x}(k))) \cdot \mathrm{rand}(), \quad \forall i \in \mathcal{V}. \tag{29}$$

However, with (29), it is possible that $\tau_i(k)$ turns out to be smaller than $t$, causing an event to
be scheduled in the past. Moreover, nodes who are two or more hops away from node $u(k)$ are
unaware of the ongoing iteration $k$ and, thus, are unable to perform an update. Fortunately, these
issues may be overcome by slightly modifying (29) as follows:

$$\tau_i(k) = \begin{cases} \max\{\Phi(\Delta V_i(\mathbf{x}(k))), t\} + \varepsilon(\Delta V_i(\mathbf{x}(k))) \cdot \mathrm{rand}(), & \text{if } i \in \mathcal{N}_{u(k)} \cup \{u(k)\}, \\ \tau_i(k-1), & \text{otherwise}, \end{cases} \quad \forall i \in \mathcal{V}. \tag{30}$$

Using (28) and (30) and by induction on $k' \in \mathbb{P}$, it can be shown that $\tau_i(k')$ satisfies

$$\max\{\Phi(\Delta V_i(\mathbf{x}(k'))), t'\} \leq \tau_i(k') \leq \max\{\Phi(\Delta V_i(\mathbf{x}(k'))), t'\} + \varepsilon(\Delta V_i(\mathbf{x}(k'))), \quad \forall k' \in \mathbb{P}, \ \forall i \in \mathcal{V},$$

where $t' = \min_{i \in \mathcal{V}} \tau_i(k'-1)$. Hence, with (30), it is highly probable that iteration $k+1$ is initiated
by a node $i$ with the maximum or a near-maximum $\Delta V_i(\mathbf{x}(k))$. It follows that with (28) and
(30), the nodes closely mimic the greedy behavior of ICHA. Note that (28) and (30) represent a
feedback iteration controller, which uses architecture A1 and follows the spirit of A2 (since $\Phi$ is
strictly decreasing and $\varepsilon$ is small) and A3 (since $\Phi(0) = \infty$). Also, $\Phi$ and $\varepsilon$ represent the controller
parameters, which may be selected based on practical wireless networking considerations (e.g., all
else being equal, ^{v) = ^ and e{v) = 0.001 yield faster convergence time than ^{v) = ^ and 
e(u) = 0.01 but higher collision probability). 
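
The event-time rule can be written as a small helper. The sketch below is only illustrative: it picks the reciprocal function for the first controller parameter and a constant 0.001 for the second as example choices, and the names `phi`, `eps`, and `event_time` are hypothetical. Note that Python's `random.random()` draws from [0, 1) rather than the open interval (0, 1), a difference of measure zero.

```python
import math
import random


def phi(v):
    """Example controller parameter: Phi(v) = 1/v, with Phi(0) = infinity."""
    return math.inf if v == 0 else 1.0 / v


def eps(v):
    """Example controller parameter: a small constant jitter scale."""
    return 0.001


def event_time(delta_v, t=0.0, rng=random):
    """Event time of a node whose potential drop is delta_v, computed at time t.

    With t = 0 this is the initial rule (28); with t > 0 it is the first
    branch of (30), which never schedules an event before the current time.
    """
    return max(phi(delta_v), t) + eps(delta_v) * rng.random()


# A node with a large potential drop fires early; a node with none never fires.
print(event_time(4.0), event_time(0.5), event_time(0.0))
```

The helper makes the two safeguards visible: the `max` with the current time keeps events out of the past, and an infinite value for a zero potential drop keeps a node with nothing to contribute from initiating.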

The above description defines a discrete event system, which can be realized via a distributed 
asynchronous algorithm, referred to as Controlled Hopwise Averaging (CHA) and stated as follows: 

Algorithm 3 (Controlled Hopwise Averaging). 
Initialization: 

1. Let time t = 0. 

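2. Each node $i \in \mathcal{V}$ transmits $|\mathcal{N}_i|$ and $y_i$ to every node $j \in \mathcal{N}_i$.

3. Each node $i \in \mathcal{V}$ creates variables $x_{ij} \in \mathbb{R}$ $\forall j \in \mathcal{N}_i$, $\hat{x}_i \in \mathbb{R}$, $\Delta V_i \in [0, \infty)$, and $\tau_i \in (0, \infty]$ and
initializes them sequentially: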



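$x_{ij} \leftarrow \dfrac{y_i/|\mathcal{N}_i| + y_j/|\mathcal{N}_j|}{c_{\{i,j\}}}, \quad \forall j \in \mathcal{N}_i,$

$\hat{x}_i \leftarrow \dfrac{\sum_{j \in \mathcal{N}_i} c_{\{i,j\}} x_{ij}}{\sum_{j \in \mathcal{N}_i} c_{\{i,j\}}},$

$\Delta V_i \leftarrow \sum_{j \in \mathcal{N}_i} c_{\{i,j\}} (x_{ij} - \hat{x}_i)^2,$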






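$\tau_i \leftarrow \Phi(\Delta V_i) + \varepsilon(\Delta V_i) \cdot \mathrm{rand}().$

Operation: At each iteration:

4. Let $t = \min_{j \in \mathcal{V}} \tau_j$ and $i \in \arg\min_{j \in \mathcal{V}} \tau_j$.

5. Node $i$ updates $x_{ij}$ $\forall j \in \mathcal{N}_i$, $\Delta V_i$, and $\tau_i$ sequentially:

$x_{ij} \leftarrow \hat{x}_i, \quad \forall j \in \mathcal{N}_i,$

$\Delta V_i \leftarrow 0,$

$\tau_i \leftarrow \infty.$

6. Node $i$ transmits $\hat{x}_i$ to every node $j \in \mathcal{N}_i$.

7. Each node $j \in \mathcal{N}_i$ updates $x_{ji}$, $\hat{x}_j$, $\Delta V_j$, and $\tau_j$ sequentially: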



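$x_{ji} \leftarrow \hat{x}_i,$

$\hat{x}_j \leftarrow \dfrac{\sum_{\ell \in \mathcal{N}_j} c_{\{j,\ell\}} x_{j\ell}}{\sum_{\ell \in \mathcal{N}_j} c_{\{j,\ell\}}},$

$\Delta V_j \leftarrow \sum_{\ell \in \mathcal{N}_j} c_{\{j,\ell\}} (x_{j\ell} - \hat{x}_j)^2,$

$\tau_j \leftarrow \max\{\Phi(\Delta V_j), t\} + \varepsilon(\Delta V_j) \cdot \mathrm{rand}().$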



Algorithm 3, or CHA, is similar to ICHA in Algorithm 2 except that each node $i$ maintains
an additional variable $\tau_i$, in Steps 3, 5, and 7, and that each iteration is initiated, in a discrete
event fashion, by a node $i$ having the minimum $\tau_i$, in Step 4. Note that "$\tau_i \leftarrow \infty$" in Step 5 is
due to "$\Delta V_i \leftarrow 0$" and to $\Phi(0) = \infty$. Moreover, every step of CHA is implementable in a fully
decentralized manner, making it a practical algorithm.
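
A compact discrete-event simulation shows how the event times drive the iterations of CHA. This is a sketch under stated assumptions rather than the authors' code: the link weights c_{i,j} = 1/|N_i| + 1/|N_j| and the initialization (y_i/|N_i| + y_j/|N_j|)/c_{i,j} are assumed, the first controller parameter is taken to be the reciprocal function, and the jitter scale is a constant. The stopping test uses the true average, which only a simulator (not the nodes) would know. The function name `cha` and its parameters are hypothetical.

```python
import math
import random


def cha(nbrs, y, tol=0.005, eps=0.001, seed=0, max_iters=100000):
    """Simulate a sketch of CHA; return (node estimates, iterations used)."""
    rng = random.Random(seed)
    deg = {i: len(nbrs[i]) for i in nbrs}

    def link(i, j):
        return frozenset((i, j))

    # Assumed weights and initialization (see the lead-in).
    c = {link(i, j): 1 / deg[i] + 1 / deg[j] for i in nbrs for j in nbrs[i]}
    x = {link(i, j): (y[i] / deg[i] + y[j] / deg[j]) / c[link(i, j)]
         for i in nbrs for j in nbrs[i]}

    def xhat(i):
        return (sum(c[link(i, j)] * x[link(i, j)] for j in nbrs[i])
                / sum(c[link(i, j)] for j in nbrs[i]))

    def dv(i):
        est = xhat(i)
        return sum(c[link(i, j)] * (x[link(i, j)] - est) ** 2 for j in nbrs[i])

    def schedule(i, t):
        v = dv(i)
        phi = math.inf if v == 0 else 1.0 / v      # assumed Phi(v) = 1/v
        return max(phi, t) + eps * rng.random()

    x_star = sum(y.values()) / len(y)              # known to the simulator only
    tau = {i: schedule(i, 0.0) for i in nbrs}      # Step 3
    iterations = 0
    while iterations < max_iters:
        if all(abs(xhat(i) - x_star) <= tol for i in nbrs):
            break
        i = min(tau, key=tau.get)                  # Step 4: earliest event
        t = tau[i]
        if math.isinf(t):
            break                                  # no event is scheduled
        est = xhat(i)
        for j in nbrs[i]:                          # Steps 5-6
            x[link(i, j)] = est
        tau[i] = math.inf
        for j in nbrs[i]:                          # Step 7: neighbors reschedule
            tau[j] = schedule(j, t)
        iterations += 1
    return {i: xhat(i) for i in nbrs}, iterations


cycle = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}
print(cha(cycle, {i: float(i) for i in cycle}))
```

Each iteration costs one real-number transmission, so the returned iteration count plus the 2N initialization transmissions gives the total cost in the sense used in the performance comparison.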

To analyze the behavior of CHA, recall that $\varepsilon$ is meant to take on small positive values, creating
just a little randomness so that the probability of wireless collisions is zero. For the purpose of
analysis, we turn this feature off (i.e., set $\varepsilon(v) = 0$ $\forall v \in [0, \infty)$) and let the symbol "$\in$" in Step 4
take care of the randomness (i.e., randomly pick an element $i$ from the set $\arg\min_{j \in \mathcal{V}} \tau_j$ whenever
it has multiple elements). We also allow $\Phi$ to be arbitrary (but satisfy the conditions stated when it
was introduced). With this setup, the following convergence properties of CHA can be established:

Theorem 4. Theorems 2 and 3, intended for ICHA described in Algorithm 2, hold verbatim for
CHA described in Algorithm 3 with any $\Phi$ and with $\varepsilon$ satisfying $\varepsilon(v) = 0$ $\forall v \in [0, \infty)$. In addition,
$\lim_{k \to \infty} t(k) = \infty$ and $V(\mathbf{x}(k)) \leq (\gamma - 1) \Phi^{-1}(t(k))$ $\forall k \in \mathbb{P}$, where $t(0) = 0$ and $t(k)$ is the time
instant at which iteration $k$ occurs.

Proof. See Appendix A.3. □

Theorem 4 characterizes the convergence of CHA in two senses: iteration and time. Iteration-wise,
it says that CHA converges exponentially and shares the same bounds $\gamma$ on convergence rate
as ICHA, regardless of $\Phi$. This result suggests that CHA does closely emulate ICHA. Time-wise,
the theorem says that CHA converges asymptotically and perhaps exponentially, depending on $\Phi$.
For example, $\Phi(v) = \frac{1}{v}$ does not guarantee exponential convergence in time (since $\Phi^{-1}(t) = \frac{1}{t}$),
but $\Phi(v) = W(\frac{1}{v})$, where $W$ is the Lambert W function, does (since $\Phi^{-1}(t) = \frac{1}{t} e^{-t}$). Therefore,
the controller parameter $\Phi$ may be used to shape the temporal convergence of CHA.
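
The inverse relationship for the Lambert W choice can be checked numerically. The sketch below evaluates the principal branch of W with a plain Newton iteration (a hand-rolled helper written for this illustration, not a library routine) and confirms that applying the controller function to exp(-t)/t returns t. The function names are hypothetical.

```python
import math


def lambert_w(z, iterations=50):
    """Principal branch of the Lambert W function for z >= 0 (Newton's method).

    Solves w * exp(w) = z, starting from log(1 + z), which lies at or to the
    right of the root, so the iteration approaches the root monotonically.
    """
    w = math.log1p(z)
    for _ in range(iterations):
        ew = math.exp(w)
        w -= (w * ew - z) / (ew * (w + 1.0))
    return w


def phi(v):
    """Phi(v) = W(1/v), with Phi(0) = infinity."""
    return math.inf if v == 0 else lambert_w(1.0 / v)


def phi_inverse(t):
    """Inverse of phi for t > 0: Phi^{-1}(t) = exp(-t) / t."""
    return math.exp(-t) / t


for t in (0.5, 1.0, 3.0, 10.0):
    print(t, phi(phi_inverse(t)))
```

Because the inverse decays like exp(-t)/t, the time-domain bound of Theorem 4 inherits an exponential decay with this choice, whereas the reciprocal choice only yields a 1/t decay.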

Remark 4. CHA has a limitation: it assumes no clock offsets among the nodes. Note, however,
that although such offsets would cause CHA to deviate from its designed behavior, they would not
render it "inoperable," i.e., $V(\mathbf{x}(k))$ would still strictly decrease after every iteration $k$, and the
conservation (12) would still hold, so that the $x_{\{i,j\}}(k)$'s and $\hat{x}_i(k)$'s would still approach $x^*$.

6 Performance Comparison 

In this section, we compare the performance of RHA and CHA with that of Pairwise Averaging
(PA) [12], Consensus Propagation (CP) [18], Algorithm A2 (A2) of [19], and Distributed Random
Grouping (DRG) [17] via extensive simulation on multi-hop wireless networks modeled by random
geometric graphs. For completeness, PA, CP, A2, and DRG are stated below, in which
$\mathcal{E}' = \{(i,j) \in \mathcal{V} \times \mathcal{V} : \{i,j\} \in \mathcal{E}\}$ denotes the set of $2L$ directed links:

Algorithm 4 (Pairwise Averaging [12]).
Initialization:

1. Each node $i \in \mathcal{V}$ creates a variable $\hat{x}_i \in \mathbb{R}$ and initializes it: $\hat{x}_i \leftarrow y_i$.

Operation: At each iteration:

2. A link, say, link $\{i,j\}$, is selected randomly and equiprobably out of the set $\mathcal{E}$ of $L$ links.
Node $i$ transmits $\hat{x}_i$ to node $j$. Node $j$ updates $\hat{x}_j$: $\hat{x}_j \leftarrow \frac{\hat{x}_i + \hat{x}_j}{2}$. Node $j$ transmits $\hat{x}_j$ to node
$i$. Node $i$ updates $\hat{x}_i$: $\hat{x}_i \leftarrow \hat{x}_j$. ■

Algorithm 5 (Consensus Propagation [18]). 
Initialization: 

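1. Each node $i \in \mathcal{V}$ creates variables $K_{ji} \geq 0$ $\forall j \in \mathcal{N}_i$, $\mu_{ji} \in \mathbb{R}$ $\forall j \in \mathcal{N}_i$, and $\hat{x}_i \in \mathbb{R}$ and
initializes them sequentially: $K_{ji} \leftarrow 0$ $\forall j \in \mathcal{N}_i$, $\mu_{ji} \leftarrow 0$ $\forall j \in \mathcal{N}_i$, $\hat{x}_i \leftarrow y_i$.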

Operation: At each iteration: 

2. A directed link, say, link («, j), is selected randomly and equiprobably out of the set £' of 
2L directed links. Node i transmits Fa = - — t— — !^ ^'''^^ — ^— — and Gjj = ', , J.^ ''^^^ — fr — - 
to node j. Node j updates Kij, fiij, and Xj sequentially: Kij ^ Fij, fiij ^ Gij, Xj ^ 

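Algorithm 6 (Algorithm A2 [19]).
Initialization:

1. Each node $i \in \mathcal{V}$ creates variables $\delta_{ij} \in \mathbb{R}$ $\forall j \in \mathcal{N}_i$ and $\hat{x}_i \in \mathbb{R}$ and initializes them sequentially:
$\delta_{ij} \leftarrow 0$ $\forall j \in \mathcal{N}_i$, $\hat{x}_i \leftarrow y_i$.

Operation: At each iteration:

2. A directed link, say, link $(i,j)$, is selected randomly and equiprobably out of the set $\mathcal{E}'$ of $2L$
directed links. Node $i$ transmits $\hat{x}_i$ to node $j$. Node $j$ updates $\delta_{ji}$: $\delta_{ji} \leftarrow \delta_{ji} + \phi(\hat{x}_i - \hat{x}_j)$.
Node $j$ transmits $\phi(\hat{x}_i - \hat{x}_j)$ to node $i$. Node $i$ updates $\delta_{ij}$: $\delta_{ij} \leftarrow \delta_{ij} - \phi(\hat{x}_i - \hat{x}_j)$. Each node
$\ell \in \mathcal{V}$ updates $\hat{x}_\ell$: $\hat{x}_\ell \leftarrow \hat{x}_\ell + \frac{\gamma}{|\mathcal{N}_\ell| + 1} \big( (\sum_{m \in \mathcal{N}_\ell} \delta_{\ell m}) + y_\ell - \hat{x}_\ell \big)$. ■

Figure 2: A 100-node, 1000-link multi-hop wireless network.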

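Algorithm 7 (Distributed Random Grouping [17]).
Initialization:

1. Each node $i \in \mathcal{V}$ creates a variable $\hat{x}_i \in \mathbb{R}$ and initializes it: $\hat{x}_i \leftarrow y_i$.

Operation: At each iteration:

2. A node, say, node $i$, is selected randomly and equiprobably out of the set $\mathcal{V}$ of $N$ nodes.
Node $i$ transmits a message to every node $j \in \mathcal{N}_i$, requesting their $\hat{x}_j$'s. Each node $j \in \mathcal{N}_i$
transmits $\hat{x}_j$ to node $i$. Node $i$ updates $\hat{x}_i$: $\hat{x}_i \leftarrow \frac{\sum_{j \in \{i\} \cup \mathcal{N}_i} \hat{x}_j}{|\mathcal{N}_i| + 1}$. Node $i$ transmits $\hat{x}_i$ to every
node $j \in \mathcal{N}_i$. Each node $j \in \mathcal{N}_i$ updates $\hat{x}_j$: $\hat{x}_j \leftarrow \hat{x}_i$.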



Note that RHA and CHA require $2N$ real-number transmissions as initialization overhead,
whereas PA, CP, A2, and DRG require none. However, PA, CP, and A2 require two real-number
transmissions per iteration and DRG requires $|\mathcal{N}_i| + 1$ (where $i$ is the node that leads an iteration),
whereas RHA and CHA require only one. Also note that CP has a parameter $\beta \in (0, \infty]$ and A2
has two parameters $\gamma \in (0, 1)$ and $\phi \in (0, \frac{1}{2})$. Moreover, PA and DRG are assumed to be free of
overlapping iterations, i.e., deficiency D6.

Figure 3: Convergence of the estimates $\hat{x}_i(k)$'s to the unknown average $x^*$ under PA, CP, A2, DRG,
RHA, and CHA for the network in Figure 2.

To compare the performance of these algorithms, two sets of simulation are carried out. The first
set corresponds to a single scenario of a multi-hop wireless network with $N = 100$ nodes, where each
node $i$ observes $y_i \in (0, 1)$ and has, on average, $\frac{2L}{N} = 20$ one-hop neighbors, as shown in Figure 2.
The second set corresponds to multi-hop wireless networks modeled by random geometric graphs,
with the number of nodes varying from $N = 100$ to $N = 500$, and the average number of neighbors
varying from $\frac{2L}{N} = 10$ to $\frac{2L}{N} = 60$. For each $N$ and $\frac{2L}{N}$, we generate 50 scenarios. For each scenario,
we randomly and uniformly place $N$ nodes in the unit square $(0,1) \times (0,1)$, gradually increase
the one-hop radius until there are $L$ links (or $\frac{2L}{N}$ neighbors on average), randomly and uniformly
generate the $y_i$'s in $(0, 1)$, and repeat this process if the resulting network is not connected. We then
simulate PA, CP, A2, DRG, RHA, and CHA until $3N^2$ real-number transmissions have occurred
(i.e., three times of what flooding needs), record the number of real-number transmissions needed
to converge (including initialization overhead, if any), and assume that this number is $3N^2$ if an
algorithm fails to converge after $3N^2$. For both sets of simulation, we let the convergence criterion
be |xj — x*| < 0.005 Vi G V and the parameters be /3 = 10^ for CP (obtained after some tuning), 
7 = 0.3 and (/) = 0.49 for A2 (ditto), and ^{v) = I and e{v) = 0.001 for CHA. 
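
The scenario-generation procedure described above can be sketched as follows. This is an illustrative reconstruction using only the Python standard library, not the authors' simulation code: `generate_scenario`, `is_connected`, and `max_tries` are hypothetical names. Taking the $L$ closest node pairs as links is assumed to be equivalent to growing the one-hop radius until $L$ links exist (true when pairwise distances are distinct), and `random.random()` samples from $[0, 1)$ rather than the open interval.

```python
import math
import random
from collections import deque
from itertools import combinations


def is_connected(neighbors):
    """Breadth-first search from node 0 over a list of adjacency sets."""
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(neighbors)


def generate_scenario(num_nodes, avg_neighbors, rng, max_tries=1000):
    """Return (positions, neighbors, observations) for one connected network."""
    # L links correspond to an average of 2L/N one-hop neighbors.
    num_links = round(num_nodes * avg_neighbors / 2)
    for _ in range(max_tries):
        pos = [(rng.random(), rng.random()) for _ in range(num_nodes)]
        # Sorting pairs by distance and keeping the L closest mimics
        # gradually increasing the radius until there are L links.
        pairs = sorted(combinations(range(num_nodes), 2),
                       key=lambda p: math.dist(pos[p[0]], pos[p[1]]))
        neighbors = [set() for _ in range(num_nodes)]
        for i, j in pairs[:num_links]:
            neighbors[i].add(j)
            neighbors[j].add(i)
        if is_connected(neighbors):
            observations = [rng.random() for _ in range(num_nodes)]
            return pos, neighbors, observations
    raise RuntimeError("no connected network found within max_tries")
```

A call such as `generate_scenario(100, 20, random.Random(0))` would produce one network with the size and density of the first simulation set; the seed is an arbitrary choice.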

Results from the first set of simulation are shown in Figure 3. Observe that PA and A2
have roughly the same performance, requiring approximately 7,000 real-number transmissions to
converge. In contrast, CP fails to converge after 10,000 transmissions, although it does achieve
a consensus. On the other hand, DRG is found to be quite efficient, needing only approximately
2,100 transmissions for convergence. Note that RHA outperforms PA, CP, and A2, but not DRG,
while CHA is the most efficient, requiring only roughly 1,300 transmissions to converge.

Results from the second set of simulation are shown in Figure 4, where the number of real-number
transmissions needed to converge, averaged over 50 scenarios, is plotted as a function of
the number of nodes $N$ and the average number of neighbors $\frac{2L}{N}$. Also included in the figure, as
a baseline for comparison, is the performance of flooding (i.e., $N^2$). Observe that regardless of $N$
and $\frac{2L}{N}$, CP has the worst bandwidth/energy efficiency, followed by PA and A2. In addition, DRG,
RHA, and CHA are all fairly efficient, with CHA again having the best efficiency. In particular,
CHA is at least 20% more efficient than DRG, and around 50% more so when the network is sparsely
connected, at $\frac{2L}{N} = 10$. Notice that the performance of DRG is achieved under the assumption that
overlapping iterations cannot occur, a condition that CHA does not require. Finally, the significant
difference in efficiency between RHA and CHA demonstrates the benefit of incorporating greedy,
decentralized, feedback iteration control.

7 Conclusion 

In this paper, we have shown that the existing distributed averaging schemes have a few draw- 
backs, which hurt their bandwidth/energy efficiency. Motivated by this, we have devised RHA, an 
asynchronous algorithm that exploits the broadcast nature of wireless medium, achieves almost sure 
asymptotic convergence, and overcomes all but one of the drawbacks. To deal with the remaining 
drawback, on lack of control, we have introduced a new way to apply Lyapunov stability theory, 
namely, the concept of greedy, decentralized, feedback iteration control. Based on this concept, we 
have developed ICHA and CHA, established bounds on their exponential convergence rates, and 
shown that CHA is practical and capable of closely mimicking the behavior of ICHA. Finally, we 
have shown via extensive simulation that CHA is substantially more bandwidth/energy efficient 
than several existing schemes. 

Several extensions of this work are possible, including design and analysis of "controlled" dis- 
tributed averaging algorithms that are applicable to more general wireless networks (e.g., with 
directed links, time-varying topologies, and dynamic observations) and more realistic communica- 
tion channels (e.g., with random delays, packet losses, and quantization effects), and that take into 
account MAC/PHY layer design issues (e.g., retransmission and backoff strategies). 
A Appendix

A.1 Proof of Theorem 2

To prove Theorem 2, we first prove the following lemma:
Lemma 2. $V(\mathbf{x}(k)) \le \gamma \max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k))$ $\forall k \in \mathbb{N}$, where $\gamma$ is as in (26).



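Proof. Let $k \in \mathbb{N}$. Notice from ([H]) that $\sum_{i \in \mathcal{V}} b_i = N$ and from ([H]), (7), (12), and (13) that
$\sum_{i \in \mathcal{V}} b_i x_i(k) = \sum_{i \in \mathcal{V}} b_i x^*$. Thus, $\sum_{i \in \mathcal{V}} \sum_{j \in \mathcal{V}} b_i b_j (x_i(k) - x_j(k))^2 = \sum_{j \in \mathcal{V}} b_j \sum_{i \in \mathcal{V}} b_i (x_i(k) - x^*)^2 + \sum_{i \in \mathcal{V}} b_i \sum_{j \in \mathcal{V}} b_j (x_j(k) - x^*)^2 = 2N \sum_{i \in \mathcal{V}} b_i (x_i(k) - x^*)^2$.

It follows from ([H]), (18), and © that

$$V(\mathbf{x}(k)) = \frac{1}{2} \sum_{i \in \mathcal{V}} \sum_{j \in \mathcal{N}_i} c_{\{i,j\}} (x_{\{i,j\}}(k) - x_i(k))^2 + \frac{1}{2} \sum_{i \in \mathcal{V}} \sum_{j \in \mathcal{N}_i} c_{\{i,j\}} (x_i(k) - x^*)^2 + \sum_{i \in \mathcal{V}} (x_i(k) - x^*) \sum_{j \in \mathcal{N}_i} c_{\{i,j\}} (x_{\{i,j\}}(k) - x_i(k)) = \frac{1}{2} \sum_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k)) + \sum_{i \in \mathcal{V}} b_i (x_i(k) - x^*)^2 \tag{31}$$

$$\le \frac{N}{2} \max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k)) + \frac{1}{2N} \sum_{i \in \mathcal{V}} \sum_{j \in \mathcal{V}} b_i b_j (x_i(k) - x_j(k))^2. \tag{32}$$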

Note from ([ED that iVmaxigvAyi(x(A;)) > Y^ieV^i^^ii^^)) = Efejef &iC|i,j}(^i(^)-a^{U}(^))^+ 
^jC{i,j}(%(fc) - 2;{ij}(A:))2 > E{jj}e^ ^£^{xi{k) - Xj{k)f. Hence, 

y y bibjixiik) - Xj{h)f < 2a7VmaxAyi(x(/i;)). (33) 

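Next, it can be shown via (19) that $\forall i \in \mathcal{V}$ with $|\mathcal{N}_i| \ge 2$, $\forall j, \ell \in \mathcal{N}_i$ with $j \ne \ell$, $c_{\{i,j\}} c_{\{i,\ell\}} (x_{\{i,j\}}(k) - x_{\{i,\ell\}}(k))^2 \le (c_{\{i,j\}} + c_{\{i,\ell\}})(c_{\{i,j\}} (x_{\{i,j\}}(k) - x_i(k))^2 + c_{\{i,\ell\}} (x_{\{i,\ell\}}(k) - x_i(k))^2) \le (c_{\{i,j\}} + c_{\{i,\ell\}}) \Delta V_i(\mathbf{x}(k))$,
implying that $|x_{\{i,j\}}(k) - x_{\{i,\ell\}}(k)| \le \big(\max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k)) (\frac{1}{c_{\{i,j\}}} + \frac{1}{c_{\{i,\ell\}}})\big)^{1/2}$. In addition, $\forall i \in \mathcal{V}$,
$\forall j \in \mathcal{N}_i$, $|x_i(k) - x_{\{i,j\}}(k)| \le \big(\frac{\max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k))}{c_{\{i,j\}}}\big)^{1/2}$ because of (19). For any $i, j \in \mathcal{V}$ with $i \ne j$,
let the sequence $(a_1, a_2, \ldots, a_{m_{ij}})$ represent a shortest path from node $i$ to node $j$, where $a_1 = i$,
$a_{m_{ij}} = j$, $\{a_\ell, a_{\ell+1}\} \in \mathcal{E}$ $\forall \ell \in \{1, 2, \ldots, m_{ij} - 1\}$, and $2 \le m_{ij} \le D + 1$. Then, it follows from the above,
the triangle inequality, and the root-mean square-arithmetic mean-geometric mean inequality that

$$|x_i(k) - x_j(k)| \le \Big(\max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k))\Big)^{1/2} \Big( \Big(\frac{1}{c_{\{a_1,a_2\}}}\Big)^{1/2} + \sum_{\ell=1}^{m_{ij}-2} \Big(\frac{1}{c_{\{a_\ell,a_{\ell+1}\}}} + \frac{1}{c_{\{a_{\ell+1},a_{\ell+2}\}}}\Big)^{1/2} + \Big(\frac{1}{c_{\{a_{m_{ij}-1},a_{m_{ij}}\}}}\Big)^{1/2} \Big)$$

$$\le \Big(m_{ij} \max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k))\Big)^{1/2} \Big( \frac{1}{c_{\{a_1,a_2\}}} + \sum_{\ell=1}^{m_{ij}-2} \Big(\frac{1}{c_{\{a_\ell,a_{\ell+1}\}}} + \frac{1}{c_{\{a_{\ell+1},a_{\ell+2}\}}}\Big) + \frac{1}{c_{\{a_{m_{ij}-1},a_{m_{ij}}\}}} \Big)^{1/2}$$

$$\le \Big(m_{ij} \max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k)) \sum_{\ell=1}^{m_{ij}} |\mathcal{N}_{a_\ell}|\Big)^{1/2}.$$

Next, we show that $\forall i, j \in \mathcal{V}$ with $i \ne j$, each node $\ell \in \mathcal{V} - \{a_1, a_2, \ldots, a_{m_{ij}}\}$ has at most 3 one-hop neighbors in $\{a_1, a_2, \ldots, a_{m_{ij}}\}$.
Clearly, this statement is true for $m_{ij} \le 3$. For $m_{ij} \ge 4$, assume to the contrary that $\exists \ell \in
\mathcal{V} - \{a_1, a_2, \ldots, a_{m_{ij}}\}$ such that $\mathcal{N}_\ell \cap \{a_1, a_2, \ldots, a_{m_{ij}}\} = \{a_{i_1}, a_{i_2}, \ldots, a_{i_n}\}$ for some $1 \le i_1 <
i_2 < \cdots < i_n \le m_{ij}$ and $n \ge 4$. Then, $(a_1, \ldots, a_{i_1}, \ell, a_{i_n}, \ldots, a_{m_{ij}})$ is a path shorter than
the shortest path $(a_1, a_2, \ldots, a_{m_{ij}})$, which is a contradiction. Therefore, the statement is true.
Consequently, $\sum_{\ell=1}^{m_{ij}} |\mathcal{N}_{a_\ell}| \le 3(N - m_{ij}) + 2(m_{ij} - 1) = 3N - m_{ij} - 2$. It follows that $\forall i, j \in
\mathcal{V}$ with $i \ne j$, $(x_i(k) - x_j(k))^2 \le m_{ij}(3N - m_{ij} - 2) \max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k))$. Since $m_{ij} \le D + 1 \le N$,
$(x_i(k) - x_j(k))^2 \le (3(N - 1) - D)(D + 1) \max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k))$. Due to this and to
$\sum_{i \in \mathcal{V}} \sum_{j \in \mathcal{V} - \mathcal{N}_i - \{i\}} b_i b_j = \sum_{i \in \mathcal{V}} \sum_{j \in \mathcal{V}} b_i b_j - \beta = N^2 - \beta$, we have $\sum_{i \in \mathcal{V}} \sum_{j \in \mathcal{V} - \mathcal{N}_i - \{i\}} b_i b_j (x_i(k) -
x_j(k))^2 \le (N^2 - \beta)(3(N - 1) - D)(D + 1) \max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k))$. This, along with (33), (32), and (26),
implies $V(\mathbf{x}(k)) \le \gamma \max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k))$. $\square$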
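
The combinatorial step in the proof above (every node off a shortest path has at most 3 one-hop neighbors on it, hence the degrees along the path sum to at most $3N - m_{ij} - 2$) can be checked numerically on small graphs. The sketch below is only an illustration: the helper names and the ring-with-chords test graph are assumptions made here, not constructions from the paper.

```python
import random
from collections import deque


def shortest_path(neighbors, src, dst):
    """Return one shortest path from src to dst (dst must be reachable)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in neighbors[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def max_on_path_neighbors(neighbors, src, dst):
    """Largest number of on-path neighbors over all off-path nodes."""
    on_path = set(shortest_path(neighbors, src, dst))
    counts = [len(neighbors[l] & on_path)
              for l in range(len(neighbors)) if l not in on_path]
    return max(counts, default=0)


def path_degree_sum_ok(neighbors, src, dst):
    """Check that the degrees along the path sum to at most 3N - m - 2."""
    path = shortest_path(neighbors, src, dst)
    total = sum(len(neighbors[a]) for a in path)
    return total <= 3 * len(neighbors) - len(path) - 2


def ring_with_chords(num_nodes, num_chords, rng):
    """Connected test graph: a ring plus randomly chosen extra links."""
    neighbors = [set() for _ in range(num_nodes)]
    for i in range(num_nodes):
        j = (i + 1) % num_nodes
        neighbors[i].add(j)
        neighbors[j].add(i)
    for _ in range(num_chords):
        i, j = rng.sample(range(num_nodes), 2)
        neighbors[i].add(j)
        neighbors[j].add(i)
    return neighbors
```

Running both checks over all node pairs of a graph such as `ring_with_chords(40, 30, random.Random(1))` exercises the bound; because the statement is proved for every connected graph, a failure would indicate a bug in the check rather than in the lemma.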

Because of ([20]), ([SID, and Lemma El we have y(x(/c - 1)) - V{y.{k)) > ^^''^;^"^^^ \/k G P, 
which is exactly ([231). To prove dMD and ([25]), note that ([231) implies F(x(A;)) < (1 - ^)''y(x(0)) 
Vfc e N. Moreover, note from ^ and dUD that V{y.{k)) > (min|ij}g£ C{jj})||x(A;) - xn^f 
VA; G N where minjjjjg^: Cjj j} > ^^^.^ 1^ 1. Furthermore, note from (f3T|) and p^ that F(x(A;)) > 
(miniev^i)llx(fc) - x*l^||2 VA; G N where miuigv ^i > ^(1 + ^^^^)- Thus, dMI and ([25]) 
hold. To derive the bounds on q, notice from (fH|) that ^' / = ^ + (1 + ^ Z]^gM-|i| t/^ + 

1 V 1 Wr 1 -U M < 1 -U n -U rnax^ev|A/"4-l v/(' 2 N ^ Af2-2Ar+2 y<- o ^ c 
2Z^teAG-{i}MI''/TO + W\' - 2 + i^ + min.evlA/'d > ' ^rn^^tevWiO - 2 Vii,j| G i. 

Similarly, it can be shown that ^ / > 1 V{i, j} G £. Hence, a G [1, ^ ~2^"'"^ ]. To derive the bounds 

on /3, observe that /3 < E.^v E,eV ^.^. = N' ■ A^o, E.eV E,eM b^h, > 2L • (i(l + ^g|^))' > 
§(1 + ^)2 and E.ev^' > ^(E.ev&O' = ^- Therefore, /3 G [iV+^(l + ^)2,iv2]. Finally, using 
(I26D, the bounds on a and /?, and the properties L > 7V-1 and [?,{N -l)-D){D + l) < 2N{N-l), 
we obtain 7 G [f + 1, A^^ - 2N'^ + f + 1]. 
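
The geometric decay that follows from (23), namely $V(\mathbf{x}(k)) \le (1 - \frac{1}{\gamma})^k V(\mathbf{x}(0))$, translates directly into a worst-case iteration count for a given target value of $V$. The helper below is a small illustrative sketch; the name `iterations_for_decay` and its arguments are hypothetical, and it assumes $\gamma \ge 1$.

```python
def iterations_for_decay(v0, target, gamma):
    """Smallest k such that (1 - 1/gamma)**k * v0 <= target.

    v0 is the initial Lyapunov value V(x(0)), target is the desired upper
    bound on V(x(k)), and gamma >= 1 is the constant from the lemma.
    """
    if gamma < 1 or v0 < 0 or target <= 0:
        raise ValueError("need gamma >= 1, v0 >= 0, and target > 0")
    factor = 1.0 - 1.0 / gamma
    k = 0
    value = v0
    # Apply the per-iteration contraction until the bound meets the target.
    while value > target:
        value *= factor
        k += 1
    return k
```

Since this count comes from an upper bound on $V$, an actual run may reach the target in fewer iterations.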

A.2 Proof of Theorem 3

Lemma 3. $V(\mathbf{x}(k)) \le \gamma \max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k))$ $\forall k \in \mathbb{N}$, where $\gamma$ is as in S1 for a path graph with $N \ge 4$,
S2 for a cycle graph, S3 for a $K$-regular graph with $K \ge 2$, and S4 for a $(N, K, \lambda, \mu)$-strongly regular
graph with $\mu \ge 1$.

Proof. Let k gN. First, suppose ^ is a path graph with iV > 4 and <S = {{1, 2}, {2, 3}, . . . , {iV - 
1, N}}. Note from © , ([E]) , dH , and dH that E{i,j}<,s T.{p,g}eS C{^,J}C{p,g}{x{i,j}{k)-X{p^g}{k)f = 
2NJ2{i j}e£ ^{i.j}(^{i,j}(^) ~ ^*)^- This, along with (fT6|) and ([H]), implies that 

^(x(A;)) = — Yl Yl C{i,j}C{p,g}(x{ij}(A;)-X|p,,}(/c))2 (34) 

^2n{ Y1 Yj (^{m}(^)-^{p,9}(^))^ + 3 Yj i^{i,2}{k) - X{,j}{k)f 
{i,j}e£' {p,q}ee' {i,j}€£' 

+ 3 ^ {x{N-i,N}{k)-X{ij}{k)f + -{x{i^2}{k)-X{N_i^Ny{k)fj, (35) 
where $\mathcal{E}' = \mathcal{E} - \{\{1,2\},\{N-1,N\}\}$. Observe from (HD, ([II]), and ^ that (x{i_i^i}(A;) - 
rc{,,,+i}(A;))2 = \^V,{^(k)) Vi G {2, iV - 1} and (x{i_i,,}(A:) - a;|,,,+i}(A;))2 = 2Ay,(x(A;)) Vi G 
{3, 4, ... , A^— 2}. By the root-mean square-arithmetic mean inequahty, ^r^ ig^, ^r „|g£-'(2;{i,j}(^) — 

^{p.gjW)^ = 2E£2^Ejl"^+i(^{M+i}(^)-^{ij+i}(^))^ ^ 2X]il2^Ejl"i+i(j-0ELm(^{^-M}(^)- 
X{,,,+i|(A:))2 = 2(7V - 3)E£3'(^ - ^ - 1)(^ - 2)Ay,(x(A:)). Moreover, 3E{ij}g£'(^{i,2}(fe) - 

^{M}(^))' ^ 3Eil2'(^ - 1)E •=2(^0-1,,} W - ^0,.+i}(^))' = i(^ - 2)(A^ - 3)Ay2(x(fc)) + 
3E«=3'(^ + ^-4)(A^-^-l)AV^.(x(A:)). Similarly, 3E{ij}e£'(^{iV-i,7V}(A;) - X{,,,}(fc))2 < |(Ar- 
2)(iV-3)Ay^_i(x(A:)) + 3E^=3'(2A'-^-3)(^-2)Ayi(x(fe)). Finally, |(x{i,2}(A:)-X{^_i,^}(A;))2 < 
i(A^-2)E£2'(^{.-M}(fc)-^{M+i}W)' = 3(A^-2)(|Ay2(x(fe))+|Ay^(x(fc))+3E£3'Al-iW^^ 
Combining the above with ()35p yields y(x(A;)) < 7maxjgv AVi(x(A;)) where 7 is as in lSli 

Now suppose ^ is a cycle graph with E = {{1, 2}, {2, 3}, . . . , {N — 1, N}, {N, 1}}. Also suppose 
N is odd. Let y € M^ be a permutation of x(A;) such that y{N,i} ^ ^{1,2} ^ y{Af,Af-i} ^ ^{2,3} < 
y{N-i,N-2} < • • • < Vi N-i jv+i -i < ?/f jv+3 iv+i|. Then, since (jM]) holds for any graph and due to (fn|) . 
y(y) = y(x(/c)). Also, due to ([19]) and ^^, maxjgv A^i(y) < maxjgy AVi{x{k)). For convenience, 
let M = 2maxjgvAyj(y) and relabel {y{N,i},y{i,2},y{N,N-i},y{2,3},y{N-i,N-2}, ■ ■ ■ ,y{E^^E±l}, 
y^N+3N+iy) as {zi,Z2,...,zn). Then, we can write V{y) = 2^ Ei=i Ei=i(^i "^j)^ = f(Ci + C'2), 

iV-l iV-3 iV-1 

where Ci = X^jJ^^ (zi - Z2j)^ + (^1 - 2:21+1)^ + {z2i - Z2i+if and C2 = Eiii Ej=j+i(^2i - 
Z2j+if + {Z2i+i - Z2j)'^ + {z2i - Z2jf + {z2i+i - Z2j+if ■ Moreovcr, from ([?]), ([H]), and ([IS]), 
we get 2:2-2:1 < VM, zn - zn^i < VM, and Zj+2 - Zi < \/M Vi e {1, 2, . . . , iV - 2}. Due 
to the property (a — 6)^ + (a — c)^ + (6 — c)^ < 2(a — c)^ Va,6, c G M with a < 6 < c, we 

have Ci < Ei=i 2(^1 — 2;2i+i)^ < Ei=\ 2i^M = — ^^ — ^^- -'-'^ addition, from the property 
(a - d)2 + (6 - c)2 < (a - 6)2 + (a - c)^ + (6 - d)^ + (c - d)^ ya,b,c,d G M, we have C2 < 

JV-3 N-1 N-3 JV-1 

E^Jl E,=Vl2(^2-^2,)' + 2(^^2m-^2,+l)' + (^2.-^2m)' + (^2,-^2, + l)' < E.Jl E,=Vl(4(i- 

j)2M + 2M) = (A^-i)(iV-3)(JV^+ii) jy^_ Combining the above, we obtain V{x{k)) = V{y) < 
^M < 7maxjgv AVi(x(/c)) where 7 is as in [S2j Next, suppose N is even. Similarly, let y E 
R^ be a permutation of x(A;) such that y{N,i} < ^{1,2} < y{N,N-i} < 2/(2,3} < y{N-i,N-2} < 
■■■ < y|iv_i ivi < yrN_^2 ^+1} — VfH Jl+n- Observe from ([M]) . ([Tl]) . and ([T9]) that V{y) = 
V{x{k)) and maxjgy Al^(y) < maxjgv AVi(x(A;)). As before, let M = 2maxjgv ^^^(y) and relabel 

{y{N,l},y{l,2},y{N,N~l},y{2,3}^y{N-l,N~2},- ■ ■ )y{JV_;^ iV|,y|JV^2,^+l}'y{i^,^+l}) ^ {zi,Z2-,. ■ ■ ,Zn)- 

Then, V{y) = ^ Zti Ef=i(^. " ^j? = ^(^1 + ^2 + C3), where Ci = Zli (^1 " ^2^f + (zi - 

——2 — — 1 
Z2i+l)'^ + {z2i-Z2i+lf + {zN-Z2if + {zN-Z2i+l)'^, C2 = EiLl E/=j+l(^2i -2;2j + l)^ + (2;2i+l -22i)^ + 

(2:2* - 2;2j)^ + {z2i+i - Z2j+i)'^, and C3 = {zi - zn)'^- Moreover, 2:2 - 21 < \/M, zn - zn-i < VM, 



and 2j+2 — Zi < \J M \/i G {1,2, . . . ,N — 2}. Using the above properties, it can be shown that 

jv_-| Il—-\ iL~^ 

Ci<Ci + Zili i^2i - Z2i+i? < E^Li 2(21 - Z2i+i)^ + 2{zN - Z2,)2 < Y:iU 2i2M + 2(f - 

i^2M = mE^m^M, C2 < eST' Zf=~lM^ - J?M + 2M) = (N-2)iN-AW^-2N^12) ^^ ^^^ 

C3 < ^-M. It follows that y(x(A:)) < 7maxjgv AVi(x(A;)) where 7 is as in 
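Next, suppose $\mathcal{G}$ is a $K$-regular graph with $K \ge 2$. Due to (fT^ and (19), $\sum_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k)) =
\frac{2}{K} \sum_{\{i,j\} \in \mathcal{E}} \big[(x_i(k) - x_{\{i,j\}}(k))^2 + (x_j(k) - x_{\{i,j\}}(k))^2\big] \ge \frac{1}{K} \sum_{\{i,j\} \in \mathcal{E}} (x_i(k) - x_j(k))^2$, implying that

$$\sum_{i \in \mathcal{V}} \sum_{j \in \mathcal{N}_i} (x_i(k) - x_j(k))^2 \le 2K \sum_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k)). \tag{36}$$

Again, because of ([HD and ([H]), $\forall i \in \mathcal{V}$, $\forall j \in \mathcal{N}_i$, $(x_{\{i,j\}}(k) - x_i(k))^2 \le \frac{K}{2} \max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k))$.
Moreover, $\forall i \in \mathcal{V}$, $\forall j, \ell \in \mathcal{N}_i$ with $j \ne \ell$, $(x_{\{i,j\}}(k) - x_{\{i,\ell\}}(k))^2 \le 2((x_{\{i,j\}}(k) - x_i(k))^2 + (x_{\{i,\ell\}}(k) -
x_i(k))^2) \le K \max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k))$. Via the preceding two inequalities and the root-mean square-
arithmetic mean inequality, it can be shown that $\forall i \in \mathcal{V}$, $\forall j \in \mathcal{V} - \mathcal{N}_i - \{i\}$, $(x_i(k) - x_j(k))^2 \le
(D + 1)(\frac{K}{2} \max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k)) \cdot 2 + K \max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k)) \cdot (D - 1)) = KD(D + 1) \max_{p \in \mathcal{V}} \Delta V_p(\mathbf{x}(k))$.
It then follows from ([52]), ([ED, and ([55]) that $V(\mathbf{x}(k)) \le \gamma \max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k))$ where $\gamma$ is as in S3.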

Finally, suppose $\mathcal{G}$ is a $(N, K, \lambda, \mu)$-strongly regular graph with $\mu \ge 1$, which means that it
is a $K$-regular graph with $K \ge 2$ and with every two non-adjacent nodes having $\mu$ common
neighbors. For every $i \in \mathcal{V}$ and $j \in \mathcal{V} - \mathcal{N}_i - \{i\}$, let $\{q_{ij1}, q_{ij2}, \ldots, q_{ij\mu}\} = \mathcal{N}_i \cap \mathcal{N}_j$. Then, from
m and (USD, /iE.evE,ev-M-w(^^(^) -%(^))' = E.ev E,ev-M-W Eti(^^(^) " %(^))' < 

(^b-,...a(^)-%(^))') ^ 2ifE.6vE,ev-M-w(^'^^W^)) + StiAn,,.(x(A:)) + AV,(x(fc))) < 
2KN{N - K - 1)(2 -F /x)maxpev Al/p(x(fc)). This, along with dMD, (HD. and ([MD, implies that 
y(x(/i;)) < 7maxjgv AVi(x(fc)) where 7 is as in lS4[ D 

Note that in the proof of Theorem 2 in Appendix A.1, Lemma 2 is used to derive (23)-(25). In
the same way, (23)-(25) can be derived using Lemma 3, completing the proof of Theorem 3.

A.3 Proof of Theorem 4

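Let $\gamma$ be as in (26) for a general graph or as in S1-S4 for a specific graph. Note that Lemmas 2
and 3 are independent of $(u(k))_{k=1}^{\infty}$ and, thus, hold for CHA as well. Hence,

$$V(\mathbf{x}(k)) \le \gamma \max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k)), \quad \forall k \in \mathbb{N}. \tag{37}$$

Next, analyzing Algorithm 3 with $\varepsilon(v) = 0$ $\forall v \in [0, \infty)$, we see that

$$V(\mathbf{x}(1)) = V(\mathbf{x}(0)) - \max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(0)), \tag{38}$$

$$V(\mathbf{x}(k+1)) \le V(\mathbf{x}(k)) - \min\{\max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k)), \Phi^{-1}(t(k))\}, \quad \forall k \in \mathbb{P}, \tag{39}$$

$$t(k+1) = \max\{\Phi(\max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k))), t(k)\}, \quad \forall k \in \mathbb{N}. \tag{40}$$

With (37)-(40), we now show by induction that $\forall k \in \mathbb{P}$, $V(\mathbf{x}(k)) \le (1 - \frac{1}{\gamma}) V(\mathbf{x}(k-1))$ and
$t(k) \le \Phi(\frac{V(\mathbf{x}(k-1))}{\gamma})$. Let $k = 1$. Then, because of (37), (38), and (40) and because $\Phi$ is strictly
decreasing, we have $V(\mathbf{x}(1)) \le (1 - \frac{1}{\gamma}) V(\mathbf{x}(0))$ and $t(1) = \Phi(\max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(0))) \le \Phi(\frac{V(\mathbf{x}(0))}{\gamma})$.
Next, let $k \ge 1$ and suppose $V(\mathbf{x}(k)) \le (1 - \frac{1}{\gamma}) V(\mathbf{x}(k-1))$ and $t(k) \le \Phi(\frac{V(\mathbf{x}(k-1))}{\gamma})$. To show
that $V(\mathbf{x}(k+1)) \le (1 - \frac{1}{\gamma}) V(\mathbf{x}(k))$ and $t(k+1) \le \Phi(\frac{V(\mathbf{x}(k))}{\gamma})$, consider the following two cases: (i)
$\max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k)) \le \Phi^{-1}(t(k))$ and (ii) $\max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k)) > \Phi^{-1}(t(k))$. For case (i), due to (37),
(39), and (40), we have $V(\mathbf{x}(k+1)) \le V(\mathbf{x}(k)) - \max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k)) \le (1 - \frac{1}{\gamma}) V(\mathbf{x}(k))$ and $t(k+1) =
\Phi(\max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k))) \le \Phi(\frac{V(\mathbf{x}(k))}{\gamma})$. For case (ii), due to (39), (40), and the hypothesis, we have
$V(\mathbf{x}(k+1)) \le V(\mathbf{x}(k)) - \Phi^{-1}(t(k)) \le V(\mathbf{x}(k)) - \frac{V(\mathbf{x}(k-1))}{\gamma} \le V(\mathbf{x}(k)) - \frac{V(\mathbf{x}(k))}{\gamma - 1} \le (1 - \frac{1}{\gamma}) V(\mathbf{x}(k))$ and
$t(k+1) = t(k) \le \Phi(\frac{V(\mathbf{x}(k-1))}{\gamma}) \le \Phi(\frac{V(\mathbf{x}(k))}{\gamma - 1}) \le \Phi(\frac{V(\mathbf{x}(k))}{\gamma})$. This completes the proof by induction.
It follows that (23) and therefore (24) and (25) hold, so that Theorems 2 and 3 hold verbatim here.
Next, observe from (40) that $(t(k))_{k=0}^{\infty}$ is non-decreasing. To show that $\lim_{k \to \infty} t(k) = \infty$, assume
to the contrary that $\exists \bar{t} \in (0, \infty)$ such that $t(k) \le \bar{t}$ $\forall k \in \mathbb{N}$. For each $k \in \mathbb{P}$, reconsider the above two
cases. Because of (39) and (40), for case (i), $V(\mathbf{x}(k)) - V(\mathbf{x}(k+1)) \ge \max_{i \in \mathcal{V}} \Delta V_i(\mathbf{x}(k)) = \Phi^{-1}(t(k+
1)) \ge \Phi^{-1}(\bar{t})$. Similarly, for case (ii), $V(\mathbf{x}(k)) - V(\mathbf{x}(k+1)) \ge \Phi^{-1}(t(k)) \ge \Phi^{-1}(\bar{t})$. Combining these
two cases, we get $V(\mathbf{x}(k+1)) \le V(\mathbf{x}(1)) - k\Phi^{-1}(\bar{t})$ $\forall k \in \mathbb{N}$. Since $\Phi^{-1}(\bar{t}) > 0$, $V(\mathbf{x}(k+1)) < 0$ for
sufficiently large $k$, which is a contradiction. Thus, $\lim_{k \to \infty} t(k) = \infty$. Finally, from the statement
shown earlier by induction, we obtain $V(\mathbf{x}(k)) \le (1 - \frac{1}{\gamma}) \cdot \gamma \Phi^{-1}(t(k)) = (\gamma - 1)\Phi^{-1}(t(k))$ $\forall k \in \mathbb{P}$.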
Figure 4: Bandwidth/energy efficiency of flooding, PA, CP, A2, DRG, RHA, and CHA on random
geometric networks with varying number of nodes $N$ and average number of neighbors $\frac{2L}{N}$.